Algorithms for Trip-Vehicle Assignment in Ride-Sharing

摘要

We investigate the ride-sharing assignment problem from an algorithmic resource allocation point of view. Given a number of requests with source and destination locations, and a number of available car locations, the task is to assign cars to requests with two requests sharing one car. We formulate this as a combinatorial optimization problem, and show that it is NP-hard. We then design an approximation algorithm which guarantees to output a solution with at most 2.5 times the optimal cost. Experiments are conducted showing that our algorithm actually has a much better approximation ratio (around 1.2) on synthetically generated data.

关键词

全文

PDF


EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

摘要

Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples — a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on L2 and L∞ distortion metrics. However, despite the fact that L1 distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting L1-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature L1-oriented adversarial examples and include the state-of-the-art L2 attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small L1 distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging L1 distortion in adversarial machine learning and security implications of DNNs.

关键词

adversarial machine learning; elastic-net optimization; adversarial example; neural network; robustness

全文

PDF


Learning Differences Between Visual Scanning Patterns Can Disambiguate Bipolar and Unipolar Patients

摘要

Bipolar Disorder (BD) and Major Depressive Disorder (MDD) are two common and debilitating mood disorders. Misdiagnosing BD as MDD is relatively common and the introduction of markers to improve diagnostic accuracy early in the course of the illness has been identified as one of the top unmet needs in the field. In this paper, we present novel methods to differentiate between BD and MDD patients. The methods use deep learning techniques to quantify differences between visual scanning patterns of BD and MDD patients. In the methods, visual scanning patterns that are described by ordered sequences of fixations on emotional faces are encoded into a lower dimensional space and are fed into a long-short term memory recurrent neural network (RNN). Fixation sequences are encoded by three different methods: 1) using semantic regions of interests (RoIs) that are manually defined by experts, 2) using semi-automatically defined grids of RoIs, or 3) using a convolutional neural network (CNN) to automatically extract visual features from saliency maps. Using data from 47 patients with MDD and 26 patients with BD we showed that using semantic RoIs, the RNN improved the performance of a baseline classifier from an AUC of 0.603 to an AUC of 0.878. Similarly using grid RoIs, the RNN improved the performance of a baseline classifier from an AUC of 0.450 to an AUC of 0.828. The classifier that automatically extracted visual features from saliency maps (a long recurrent convolutional network that is fully data-driven) had an AUC of 0.879. The results of the study suggest that by using RNNs to learn differences between fixation sequences the diagnosis of individual patients with BD or MDD can be disambiguated with high accuracy. Moreover, by using saliency maps and CNN to encode the fixation sequences the method can be fully automated and achieve high accuracy without relying on user expertise and/or manual labelling. When compared with other markers, the performance of the class of classifiers that was introduced in this paper is better than that of detectors that use differences in neural structures, neural activity or cortical hemodynamics to differentiate between BD and MDD patients. The novel use of RNNs to quantify differences between fixation sequences of patients with mood disorders can be easily generalized to studies of other neuropsychological disorders and to other fields such as psychology and advertising.

关键词

Bipolar disorder; major depressive disorder; RNN; LRCN; visual scanning; fixation sequences

全文

PDF


Comparing Population Means Under Local Differential Privacy: With Significance and Power

摘要

A statistical hypothesis test determines whether a hypothesis should be rejected based on samples from populations. In particular, randomized controlled experiments (or A/B testing) that compare population means using, e.g., t-tests, have been widely deployed in technology companies to aid in making data-driven decisions. Samples used in these tests are collected from users and may contain sensitive information. Both the data collection and the testing process may compromise individuals’ privacy. In this paper, we study how to conduct hypothesis tests to compare population means while preserving privacy. We use the notation of local differential privacy (LDP), which has recently emerged as the main tool to ensure each individual’s privacy without the need of a trusted data collector. We propose LDP tests that inject noise into every user’s data in the samples before collecting them (so users do not need to trust the data collector), and draw conclusions with bounded type-I (significance level) and type-II errors (1 - power). Our approaches can be extended to the scenario where some users require LDP while some are willing to provide exact data. We report experimental results on real-world datasets to verify the effectiveness of our approaches.

关键词

Local model of differential privacy; statistical hypothesis test; significance level; statistical power

全文

PDF


MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

摘要

Generating music has a few notable differences from generating images and videos. First, music is an art of time, necessitating a temporal model. Second, music is usually composed of multiple instruments/tracks with their own temporal dynamics, but collectively they unfold over time interdependently. Lastly, musical notes are often grouped into chords, arpeggios or melodies in polyphonic music, and thereby introducing a chronological ordering of notes is not naturally suitable. In this paper, we propose three models for symbolic multi-track music generation under the framework of generative adversarial networks (GANs). The three models, which differ in the underlying assumptions and accordingly the network architectures, are referred to as the jamming model, the composer model and the hybrid model. We trained the proposed models on a dataset of over one hundred thousand bars of rock music and applied them to generate piano-rolls of five tracks: bass, drums, guitar, piano and strings. A few intra-track and inter-track objective metrics are also proposed to evaluate the generative results, in addition to a subjective user study. We show that our models can generate coherent music of four bars right from scratch (i.e. without human inputs). We also extend our models to human-AI cooperative music generation: given a specific track composed by human, we can generate four additional tracks to accompany it. All code, the dataset and the rendered audio samples are available at https://salu133445.github.io/musegan/.

关键词

全文

PDF


Picasso, Matisse, or a Fake? Automated Analysis of Drawings at the Stroke Level for Attribution and Authentication

摘要

This paper proposes a computational approach for analysis of strokes in line drawings by artists. We aim at developing an AI methodology that facilitates attribution of drawings of unknown authors in a way that is not easy to be deceived by forged art. The methodology used is based on quantifying the characteristics of individual strokes in drawings. We propose a novel algorithm for segmenting individual strokes. We propose an approach that combines different hand-crafted and learned features for the task of quantifying stroke characteristics. We experimented with a dataset of 300 digitized drawings with over 80 thousands strokes. The collection mainly consisted of drawings of Pablo Picasso, Henry Matisse, and Egon Schiele, besides a small number of representative works of other artists. The experiments shows that the proposed methodology can classify individual strokes with accuracy 70%-90%, and aggregate over drawings with accuracy above 80%, while being robust to be deceived by fakes.

关键词

Deep Neural Networks, Art authentication,

全文

PDF


Beyond Distributive Fairness in Algorithmic Decision Making: Feature Selection for Procedurally Fair Learning

摘要

With widespread use of machine learning methods in numerous domains involving humans, several studies have raised questions about the potential for unfairness towards certain individuals or groups. A number of recent works have proposed methods to measure and eliminate unfairness from machine learning models. However, most of this work has focused on only one dimension of fair decision making: distributive fairness, i.e., the fairness of the decision outcomes. In this work, we leverage the rich literature on organizational justice and focus on another dimension of fair decision making: procedural fairness, i.e., the fairness of the decision making process. We propose measures for procedural fairness that consider the input features used in the decision process, and evaluate the moral judgments of humans regarding the use of these features. We operationalize these measures on two real world datasets using human surveys on the Amazon Mechanical Turk (AMT) platform, demonstrating that our measures capture important properties of procedurally fair decision making. We provide fast submodular mechanisms to optimize the tradeoff between procedural fairness and prediction accuracy. On our datasets, we observe empirically that procedural fairness may be achieved with little cost to outcome fairness, but that some loss of accuracy is unavoidable.

关键词

Fair decision making; Machine learning and law; Discrimination in classification

全文

PDF


Distributed Composite Quantization

摘要

Approximate nearest neighbor (ANN) search is a fundamental problem in computer vision, machine learning and information retrieval. Recently, quantization-based methods have drawn a lot of attention due to their superior accuracy and comparable efficiency compared with traditional hashing techniques. However, despite the prosperity of quantization techniques, they are all designed for the centralized setting, i.e., quantization is performed on the data on a single machine. This makes it difficult to scale these techniques to large-scale datasets. Built upon the Composite Quantization, we propose a novel quantization algorithm for data dis- tributed across different nodes of an arbitrary network. The proposed Distributed Composite Quantization (DCQ) decom-poses Composite Quantization into a set of decentralized sub-problems such that each node solves its own sub-problem on its local data, meanwhile is still able to attain consistent quantizers thanks to the consensus constraint. Since there is no exchange of training data across the nodes in the learning process, the communication cost of our method is low. Ex- tensive experiments on ANN search and image retrieval tasks validate that the proposed DCQ significantly improves Composite Quantization in both efficiency and scale, while still maintaining competitive accuracy.

关键词

全文

PDF


Tensorized Projection for High-Dimensional Binary Embedding

摘要

Embedding high-dimensional visual features (d-dimensional) to binary codes (b-dimensional) has shown advantages in various vision tasks such as object recognition and image retrieval. Meanwhile, recent works have demonstrated that to fully utilize the representation power of high-dimensional features, it is critical to encode them into long binary codes rather than short ones, i.e., b ~ O(d). However, generating long binary codes involves large projection matrix and high-dimensional matrix-vector multiplication, thus is memory and computationally intensive. To tackle these problems, we propose Tensorized Projection (TP) to decompose the projection matrix using Tensor-Train (TT) format, which is a chain-like representation that allows to operate tensor in an efficient manner. As a result, TP can drastically reduce the computational complexity and memory cost. Moreover, by using the TT-format, TP can regulate the projection matrix against the risk of over-fitting, consequently, lead to better performance than using either dense projection matrix (like ITQ) or sparse projection matrix. Experimental comparisons with state-of-the-art methods over various visual tasks demonstrate both the efficiency and performance ad- vantages of our proposed TP, especially when generating high dimensional binary codes, e.g., when b ≥ d.

关键词

全文

PDF


Predicting Aesthetic Score Distribution Through Cumulative Jensen-Shannon Divergence

摘要

Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay with semantic contents and photographic technologies. Recent studies on the powerful deep learning based aesthetic quality assessment usually use a binary high-low label or a numerical score to represent the aesthetic quality. However the scalar representation cannot describe well the underlying varieties of the human perception of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector of the ordinal basic human ratings) using Deep Convolutional Neural Network (DCNN). Conventional DCNNs which aim to minimize the difference between the predicted scalar numbers or vectors and the ground truth cannot be directly used for the ordinal basic rating distribution. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, with a new reliability-sensitive learning method based on the kurtosis of the score distribution, which eliminates the requirement of the original full data of human ratings (without normalization). Experimental results on large scale aesthetic dataset demonstrate the effectiveness of our introduced CJS-CNN in this task.

关键词

Image Aesthetic Assessment; Score Distribution; Cumulative Jensen-Shannon Divergence

全文

PDF


Norm Conflict Resolution in Stochastic Domains

摘要

Artificial agents will need to be aware of human moral and social norms, and able to use them in decision-making. In particular, artificial agents will need a principled approach to managing conflicting norms, which are common in human social interactions. Existing logic-based approaches suffer from normative explosion and are typically designed for deterministic environments; reward-based approaches lack principled ways of determining which normative alternatives exist in a given environment. We propose a hybrid approach, using Linear Temporal Logic (LTL) representations in Markov Decision Processes (MDPs), that manages norm conflicts in a systematic manner while accommodating domain stochasticity. We provide a proof-of-concept implementation in a simulated vacuum cleaning domain.

关键词

normative systems; Markov Decision Processes; Linear Temporal Logic; norm conflict

全文

PDF


Deep Representation-Decoupling Neural Networks for Monaural Music Mixture Separation

摘要

Monaural source separation (MSS) aims to extract and reconstruct different sources from a single-channel mixture, which could facilitate a variety of applications such as chord recognition, pitch estimation and automatic transcription. In this paper, we study the problem of separating vocals and instruments from monaural music mixture. Existing works for monaural source separation either utilize linear and shallow models (e.g., non-negative matrix factorization), or do not explicitly address the coupling and tangling of multiple sources in original input signals, hence they do not perform satisfactorily in real-world scenarios. To overcome the above limitations, we propose a novel end-to-end framework for monaural music mixture separation called Deep Representation-Decoupling Neural Networks (DRDNN). DRDNN takes advantages of both traditional signal processing methods and popular deep learning models. For each input of music mixture, DRDNN converts it to a two-dimensional time-frequency spectrogram using short-time Fourier transform (STFT), followed by stacked convolutional neural networks (CNN) layers and long-short term memory (LSTM) layers to extract more condensed features. Afterwards, DRDNN utilizes a decoupling component, which consists of a group of multi-layer perceptrons (MLP), to decouple the features further into different separated sources. The design of decoupling component in DRDNN produces purified single-source signals for subsequent full-size restoration, and can significantly improve the performance of final separation. Through extensive experiments on real-world dataset, we prove that DRDNN outperforms state-of-the-art baselines in the task of monaural music mixture separation and reconstruction.

关键词

全文

PDF


Early Prediction of Diabetes Complications from Electronic Health Records: A Multi-Task Survival Analysis Approach

摘要

Type 2 diabetes mellitus (T2DM) is a chronic disease that usually results in multiple complications. Early identification of individuals at risk for complications after being diagnosed with T2DM is of significant clinical value. In this paper, we present a new data-driven predictive approach to predict when a patient will develop complications after the initial T2DM diagnosis. We propose a novel survival analysis method to model the time-to-event of T2DM complications designed to simultaneously achieve two important metrics: 1) accurate prediction of event times, and 2) good ranking of the relative risks of two patients. Moreover, to better capture the correlations of time-to-events of the multiple complications, we further develop a multi-task version of the survival model. To assess the performance of these approaches, we perform extensive experiments on patient level data extracted from a large electronic health record claims database. The results show that our new proposed survival analysis approach consistently outperforms traditional survival models and demonstrate the effectiveness of the multi-task framework over modeling each complication independently.

关键词

Healthcare; Diabetes; EHR; Survival Analysis; Multi-task Learning

全文

PDF


Learning the Joint Representation of Heterogeneous Temporal Events for Clinical Endpoint Prediction

摘要

The availability of a large amount of electronic health records (EHR) provides huge opportunities to improve health care service by mining these data. One important application is clinical endpoint prediction, which aims to predict whether a disease, a symptom or an abnormal lab test will happen in the future according to patients' history records. This paper develops deep learning techniques for clinical endpoint prediction, which are effective in many practical applications. However, the problem is very challenging since patients' history records contain multiple heterogeneous temporal events such as lab tests, diagnosis, and drug administrations. The visiting patterns of different types of events vary significantly, and there exist complex nonlinear relationships between different events. In this paper, we propose a novel model for learning the joint representation of heterogeneous temporal events. The model adds a new gate to control the visiting rates of different events which effectively models the irregular patterns of different events and their nonlinear correlations. Experiment results with real-world clinical data on the tasks of predicting death and abnormal lab tests prove the effectiveness of our proposed approach over competitive baselines.

关键词

Electronic Health Record; Sequential Data Modeling; Deep Learning

全文

PDF


Multi-View Multi-Graph Embedding for Brain Network Clustering Analysis

摘要

Network analysis of human brain connectivity is critically important for understanding brain function and disease states. Embedding a brain network as a whole graph instance into a meaningful low-dimensional representation can be used to investigate disease mechanisms and inform therapeutic interventions. Moreover, by exploiting information from multiple neuroimaging modalities or views, we are able to obtain an embedding that is more useful than the embedding learned from an individual view. Therefore, multi-view multi-graph embedding becomes a crucial task. Currently only a few studies have been devoted to this topic, and most of them focus on vector-based strategy which will cause structural information contained in the original graphs lost. As a novel attempt to tackle this problem, we propose Multi-view Multi-graph Embedding M2E by stacking multi-graphs into multiple partially-symmetric tensors and using tensor techniques to simultaneously leverage the dependencies and correlations among multi-view and multi-graph brain networks. Extensive experiments on real HIV and bipolar disorder brain network datasets demonstrate the superior performance of M2E on clustering brain networks by leveraging the multi-view multi-graph interactions.

关键词

Brain Network Embedding;Multi-view Learning;Tensor Factorization

全文

PDF


Uplink Communication Efficient Differentially Private Sparse Optimization With Feature-Wise Distributed Data

摘要

Preserving differential privacy during empirical risk minimization model training has been extensively studied under centralized and sample-wise distributed dataset settings. This paper considers a nearly unexplored context with features partitioned among different parties under privacy restriction. Motivated by the nearly optimal utility guarantee achieved by centralized private Frank-Wolfe algorithm (Talwar, Thakurta, and Zhang 2015), we develop a distributed variant with guaranteed privacy, utility and uplink communication complexity. To obtain these guarantees, we provide a much generalized convergence analysis for block-coordinate Frank-Wolfe under arbitrary sampling, which greatly extends known convergence results that are only applicable to two specific block sampling distributions. We also design an active feature sharing scheme by utilizing private Johnson-Lindenstrauss transform, which is the key to updating local partial gradients in a differentially private and communication efficient manner.

关键词

Differential Privacy,Frank-Wolfe Algorithm,Distributed Sparse Optimization

全文

PDF


Learning the Probability of Activation in the Presence of Latent Spreaders

摘要

When an infection spreads in a community, an individual's probability of becoming infected depends on both her susceptibility and exposure to the contagion through contact with others. While one often has knowledge regarding an individual's susceptibility, in many cases, whether or not an individual's contacts are contagious is unknown.We study the problem of predicting if an individual will adopt a contagion in the presence of multiple modes of infection (exposure/susceptibility) and latent neighbor influence. We present a generative probabilistic model and a variational inference method to learn the parameters of our model. Through a series of experiments on synthetic data, we measure the ability of the proposed model to identify latent spreaders, and predict the risk of infection. Applied to a real dataset of 20,000 hospital patients, we demonstrate the utility of our model in predicting the onset of a healthcare associated infection using patient room-sharing and nurse-sharing networks. Our model outperforms existing benchmarks and provides actionable insights for the design and implementation of targeted interventions to curb the spread of infection.

关键词

healthcare; infectious diseases; probabilistic graphical model; variational inference

全文

PDF


Generating an Event Timeline About Daily Activities From a Semantic Concept Stream

摘要

Recognizing activities of daily living (ADLs) in the real world is an important task for understanding everyday human life. However, even though our life events consist of chronological ADLs with the corresponding places and objects (e.g., drinking coffee in the living room after making coffee in the kitchen and walking to the living room), most existing works focus on predicting individual activity labels from sensor data. In this paper, we introduce a novel framework that produces an event timeline of ADLs in a home environment. The proposed method combines semantic concepts such as action, object, and place detected by sensors for generating stereotypical event sequences with the following three real-world properties. First, we use temporal interactions among concepts to remove objects and places unrelated to each action. Second, we use commonsense knowledge mined from a language resource to find a possible combination of concepts in the real world. Third, we use temporal variations of events to filter repetitive events, since our daily life changes over time. We use cross-place validation to evaluate our proposed method on a daily-activities dataset with manually labeled event descriptions. The empirical evaluation demonstrates that our method using real-world properties improves the performance of generating an event timeline over diverse environments.

关键词

Activity Recognition; Event Timeline Generation; Activities of Daily Living; Semantic Concept Stream

全文

PDF


On Organizing Online Soirees with Live Multi-Streaming

摘要

The popularity of live streaming has led to the explosive growth in new video contents and social communities on emerging platforms such as Facebook Live and Twitch. Viewers on these platforms are able to follow multiple streams of live events simultaneously, while engaging discussions with friends. However, existing approaches for selecting live streaming channels still focus on satisfying individual preferences of users, without considering the need to accommodate real-time social interactions among viewers and to diversify the content of streams. In this paper, therefore, we formulate a new Social-aware Diverse and Preferred Live Streaming Channel Query (SDSQ) that jointly selects a set of diverse and preferred live streaming channels and a group of socially tight viewers. We prove that SDSQ is NP-hard and inapproximable within any factor, and design SDSSel, a 2-approximation algorithm with a guaranteed error bound. We perform a user study on Twitch with 432 participants to validate the need of SDSQ and the usefulness of SDSSel. We also conduct large-scale experiments on real datasets to demonstrate the superiority of the proposed algorithm over several baselines in terms of solution quality and efficiency.

关键词

全文

PDF


An AI Planning Solution to Scenario Generation for Enterprise Risk Management

摘要

Scenario planning is a commonly used method by companies to develop their long-term plans. Scenario planning for risk management puts an added emphasis on identifying and managing emerging risk. While a variety of methods have been proposed for this purpose, we show that applying AI planning techniques to devise possible scenarios provides a unique advantage for scenario planning. Our system, the Scenario Planning Advisor (SPA), takes as input the relevant information from news and social media, representing key risk drivers, as well as the domain knowledge and generates scenarios that explain the key risk drivers and describe the alternative futures. To this end, we provide a characterization of the problem, knowledge engineering methodology, and transformation to planning. Furthermore, we describe the computation of the scenarios, lessons learned, and the feedback received from the pilot deployment of the SPA system in IBM.

关键词

Applications; Planning and Scheduling; Activity and Plan Recognition; Model-Based Reasoning; Risk Management

全文

PDF


Automated Segmentation of Overlapping Cytoplasm in Cervical Smear Images via Contour Fragments

摘要

We present a novel method for automated segmentation of overlapping cytoplasm in cervical smear images based on contour fragments. We formulate the segmentation problem as a graphical model, and employ the contour fragments generated from cytoplasm clump to construct the graph. Compared with traditional methods that are based on pixels, our contour fragment-based solution can take more geometric information into account and hence generate more accurate prediction of the overlapping boundaries. We further design a novel energy function for the graph, and by minimizing the energy function, fragments that come from the same cytoplasm are selected into the same set. To construct the energy function, our fragments-based data term and pairwise term are measured from the spatial relation and shape prior, which offer more geometric information for the occluded boundary inference. Afterwards, occluded boundaries are inferred using the minimal path model, in which shape of each individual cytoplasm is reconstructed on the selected fragments set. Constructed shape is used as a constraint to locate the searching area, and curvature regulation is enforced to promote the smoothness of inference result. The inference result, in turn, is used as the shape prior to construct a high-level shape regulation energy term of the built graph, and then graph energy is updated. In other words, fragments selection and occluded boundary inference are iterative processed; this interaction makes more potential shape information accessible. Using two cervical smear datasets, the performance of our method is extensively evaluated and compared with that of the state-of-the-art approaches; the results show the superiority of the proposed method.

关键词

cervical cancer; overlapping objects segmentation; fragments-based graphical model

全文

PDF


When Social Advertising Meets Viral Marketing: Sequencing Social Advertisements for Influence Maximization

摘要

Recent studies reveal that social advertising is more effective than conventional online advertising. This is mainly because conventional advertising targets at individual's interest while social advertising is able to produce a large cascade of further exposures to other users via social influence. This motivates us to study the optimal social advertising problem from platform's perspective, and our objective is to find the best ad sequence for each user in order to maximize the expected revenue. Although there is rich body of work that has been devoted to ad sequencing, the network value of each customer is largely ignored in existing algorithm design. To fill this gap, we propose to integrate viral marketing into existing ad sequencing model, and develop both non-adaptive and adaptive ad sequencing policies that can maximize the viral marketing efficiency.

关键词

全文

PDF


Synthesis of Programs from Multimodal Datasets

摘要

We describe MultiSynth, a framework for synthesizing domain-specific programs from a multimodal dataset of examples. Given a domain-specific language (DSL), a dataset is multimodal if there is no single program in the DSL that generalizes over all the examples. Further, even if the examples in the dataset were generalized in terms of a set of programs, the domains of these programs may not be disjoint, thereby leading to ambiguity in synthesis. MultiSynth is a framework that incorporates concepts of synthesizing programs with minimum generality, while addressing the need of accurate prediction. We show how these can be achieved through (i) transformation driven partitioning of the dataset, (ii) least general generalization, for a generalized specification of the input and the output, and (iii) learning to rank, for estimating feature weights in order to map an input to the most appropriate mode in case of ambiguity. We show the effectiveness of our framework in two domains: in the first case, we extend an existing approach for synthesizing programs for XML tree transformations to ambiguous multimodal datasets. In the second case, MultiSynth is used to preorder words for machine translation, by learning permutations of productions in the parse trees of the source side sentences. Our evaluations reflect the effectiveness of our approach.

关键词

APP: Other Applications; MLA: Applications of Supervised Learning; HSO: Optimization; NLPML: Natural Language Processing (General/Other); HSO: Search (General/Other)

全文

PDF


CD-CNN: A Partially Supervised Cross-Domain Deep Learning Model for Urban Resident Recognition

摘要

Driven by the wave of urbanization in recent decades, the research topic about migrant behavior analysis draws great attention from both academia and the government. Nevertheless, subject to the cost of data collection and the lack of modeling methods, most of existing studies use only questionnaire surveys with sparse samples and non-individual level statistical data to achieve coarse-grained studies of migrant behaviors. In this paper, a partially supervised cross-domain deep learning model named CD-CNN is proposed for migrant/native recognition using mobile phone signaling data as behavioral features and questionnaire survey data as incomplete labels. Specifically, CD-CNN features in decomposing the mobile data into location domain and communication domain, and adopts a joint learning framework that combines two convolutional neural networks with a feature balancing scheme. Moreover, CD-CNN employs a three-step algorithm for training, in which the co-training step is of great value to partially supervised cross-domain learning. Comparative experiments on the city Wuxi demonstrate the high predictive power of CD-CNN. Two interesting applications further highlight the ability of CD-CNN for in-depth migrant behavioral analysis.

关键词

全文

PDF


Geographic Differential Privacy for Mobile Crowd Coverage Maximization

摘要

For real-world mobile applications such as location-based advertising and spatial crowdsourcing, a key to success is targeting mobile users that can maximally cover certain locations in a future period. To find an optimal group of users, existing methods often require information about users' mobility history, which may cause privacy breaches. In this paper, we propose a method to maximize mobile crowd's future location coverage under a guaranteed location privacy protection scheme. In our approach, users only need to upload one of their frequently visited locations, and more importantly, the uploaded location is obfuscated using a geographic differential privacy policy. We propose both analytic and practical solutions to this problem. Experiments on real user mobility datasets show that our method significantly outperforms the state-of-the-art geographic differential privacy methods by achieving a higher coverage under the same level of privacy protection.

关键词

location privacy; differential privacy; mobility

全文

PDF


Catching Captain Jack: Efficient Time and Space Dependent Patrols to Combat Oil-Siphoning in International Waters

摘要

Pirate syndicates capturing tankers to siphon oil, causing an estimated cost of $5 billion a year, has become a serious security issue for maritime traffic. In response to the threat, coast guards and navies deploy patrol boats to protect international oil trade. However, given the vast area of the sea and the highly time and space dependent behaviors of both players, it remains a significant challenge to find efficient ways to deploy patrol resources. In this paper, we address the research challenges and provide four key contributions. First, we construct a Stackelberg model of the oil-siphoning problem based on incident reports of actual attacks; Second, we propose a compact formulation and a constraint generation algorithm, which tackle the exponentially growth of the defender’s and attacker’s strategy spaces, respectively, to compute efficient strategies of security agencies; Third, to further improve the scalability, we propose an abstraction method, which exploits the intrinsic similarity of defender’s strategy space, to solve extremely large-scale games; Finally, we evaluate our approaches through extensive simulations and a detailed case study with real ship traffic data. The results demonstrate that our approach achieves a dramatic improvement of scalability with modest influence on the solution quality and can scale up to realistic-sized problems.

关键词

Game theory; Security and Privacy

全文

PDF


Video Summarization via Semantic Attended Networks

摘要

The goal of video summarization is to distill a raw video into a more compact form without losing much semantic information. However, previous methods mainly consider the diversity and representation interestingness of the obtained summary, and they seldom pay sufficient attention to semantic information of resulting frame set, especially the long temporal range semantics. To explicitly address this issue, we propose a novel technique which is able to extract the most semantically relevant video segments (i.e., valid for a long term temporal duration) and assemble them into an informative summary. To this end, we develop a semantic attended video summarization network (SASUM) which consists of a frame selector and video descriptor to select an appropriate number of video shots by minimizing the distance between the generated description sentence of the summarized video and the human annotated text of the original video. Extensive experiments show that our method achieves a superior performance gain over previous methods on two benchmark datasets.

关键词

全文

PDF


TIMERS: Error-Bounded SVD Restart on Dynamic Networks

摘要

Singular Value Decomposition (SVD) is a popular approach in various network applications, such as link prediction and network parameter characterization. Incremental SVD approaches are proposed to process newly changed nodes and edges in dynamic networks. However, incremental SVD approaches suffer from serious error accumulation inevitably due to approximation on incremental updates. SVD restart is an effective approach to reset the aggregated error, but when to restart SVD for dynamic networks is not addressed in literature. In this paper, we propose TIMERS, Theoretically Instructed Maximum-Error-bounded Restart of SVD, a novel approach which optimally sets the restart time in order to reduce error accumulation in time. Specifically, we monitor the margin between reconstruction loss of incremental updates and the minimum loss in SVD model. To reduce the complexity of monitoring, we theoretically develop a lower bound of SVD minimum loss for dynamic networks and use the bound to replace the minimum loss in monitoring. By setting a maximum tolerated error as a threshold, we can trigger SVD restart automatically when the margin exceeds this threshold. We prove that the time complexity of our method is linear with respect to the number of local dynamic changes, and our method is general across different types of dynamic networks. We conduct extensive experiments on several synthetic and real dynamic networks. The experimental results demonstrate that our proposed method significantly outperforms the existing methods by reducing 27% to 42% in terms of the maximum error for dynamic network reconstruction when fixing the number of restarts. Our method reduces the number of restarts by 25% to 50% when fixing the maximum error tolerated.

关键词

全文

PDF


Ranking Users in Social Networks With Higher-Order Structures

摘要

PageRank has been widely used to measure the authority or the influence of a user in social networks. However, conventional PageRank only makes use of edge-based relations, ignoring higher-order structures captured by motifs, subgraphs consisting of a small number of nodes in complex networks. In this paper, we propose a novel framework, motif-based PageRank (MPR), to incorporate higher-order structures into conventional PageRank computation. We conduct extensive experiments in three real-world networks, i.e., DBLP, Epinions, and Ciao, to show that MPR can significantly improve the effectiveness of PageRank for ranking users in social networks. In addition to numerical results, we also provide detailed analysis for MPR to show how and why incorporating higher-order information works better than PageRank in ranking users in social networks.

关键词

Motif-based PageRank; Social Networks; Higher-order;

全文

PDF


Mitigating Overexposure in Viral Marketing

摘要

In traditional models for word-of-mouth recommendations and viral marketing, the objective function has generally been based on reaching as many people as possible. However, a number of studies have shown that the indiscriminate spread of a product by word-of-mouth can result in overexposure, reaching people who evaluate it negatively. This can lead to an effect in which the over-promotion of a product can produce negative reputational effects, by reaching a part of the audience that is not receptive to it. How should one make use of social influence when there is a risk of overexposure? In this paper, we develop and analyze a theoretical model for this process; we show how it captures a number of the qualitative phenomena associated with overexposure, and for the main formulation of our model, we provide a polynomial-time algorithm to find the optimal marketing strategy. We also present simulations of the model on real network topologies, quantifying the extent to which our optimal strategies outperform natural baselines.

关键词

Social Networks, Influence Maximization, Viral Marketing, Game Theory and Economic Paradigms

全文

PDF


Neural Link Prediction over Aligned Networks

摘要

Link prediction is a fundamental problem with a wide range of applications in various domains, which predicts the links that are not yet observed or the links that may appear in the future. Most existing works in this field only focus on modeling a single network, while real-world networks are actually aligned with each other. Network alignments contain valuable additional information for understanding the networks, and provide a new direction for addressing data insufficiency and alleviating cold start problem. However, there are rare works leveraging network alignments for better link prediction. Besides, neural network is widely employed in various domains while its capability of capturing high-level patterns and correlations for link prediction problem has not been adequately researched yet. Hence, in this paper we target atlink prediction over aligned networks using neural networks. The major challenge is the heterogeneousness of the considered networks, as the networks may have different characteristics, link purposes, etc. To overcome this, we propose a novel multi-neural-network framework MNN, where we have one individual neural network for each heterogeneous target or feature while the vertex representations are shared. We further discuss training methods for the multi-neural-network framework. Extensive experiments demonstrate that MNN outperforms the state-of-the-art methods and achieves 3% to 5% relative improvement of AUC score across different settings, particularly over 8% for cold start scenarios.

关键词

Link prediction; Network alignment; Neural network

全文

PDF


Privacy Preserving Point-of-Interest Recommendation Using Decentralized Matrix Factorization

摘要

Points of interest (POI) recommendation has been drawn much attention recently due to the increasing popularity of location-based networks, e.g., Foursquare and Yelp. Among the existing approaches to POI recommendation, Matrix Factorization (MF) based techniques have proven to be effective. However, existing MF approaches suffer from two major problems: (1) Expensive computations and storages due to the centralized model training mechanism: the centralized learners have to maintain the whole user-item rating matrix, and potentially huge low rank matrices. (2) Privacy issues: the users' preferences are at risk of leaking to malicious attackers via the centralized learner. To solve these, we present a Decentralized MF (DMF) framework for POI recommendation. Specifically, instead of maintaining all the low rank matrices and sensitive rating data for training, we propose a random walk based decentralized training technique to train MF models on each user's end, e.g., cell phone and Pad. By doing so, the ratings of each user are still kept on one's own hand, and moreover, decentralized learning can be taken as distributed learning with multi-learners (users), and thus alleviates the computation and storage issue. Experimental results on two real-world datasets demonstrate that, comparing with the classic and state-of-the-art latent factor models, DMF significantly improvements the recommendation performance in terms of precision and recall.

关键词

recommender system, decentralized learning, POI recommendation

全文

PDF


CA-RNN: Using Context-Aligned Recurrent Neural Networks for Modeling Sentence Similarity

摘要

The recurrent neural networks (RNNs) have shown good performance for sentence similarity modeling in recent years. Most RNNs focus on modeling the hidden states based on the current sentence, while the context information from the other sentence is not well investigated during the hidden state generation. In this paper, we propose a context-aligned RNN (CA-RNN) model, which incorporates the contextual information of the aligned words in a sentence pair for the inner hidden state generation. Specifically, we first perform word alignment detection to identify the aligned words in the two sentences. Then, we present a context alignment gating mechanism and embed it into our model to automatically absorb the aligned words' context for the hidden state update. Experiments on three benchmark datasets, namely TREC-QA and WikiQA for answer selection and MSRP for paraphrase identification, show the great advantages of our proposed model. In particular, we achieve the new state-of-the-art performance on TREC-QA and WikiQA. Furthermore, our model is comparable to if not better than the recent neural network based approaches on MSRP.

关键词

context alignment gating; context-aligned recurrent neural networks; sentence similarity modeling

全文

PDF


Dual Deep Neural Networks Cross-Modal Hashing

摘要

Recently, deep hashing methods have attracted much attention in multimedia retrieval task. Some of them can even perform cross-modal retrieval. However, almost all existing deep cross-modal hashing methods are pairwise optimizing methods, which means that they become time-consuming if they are extended to large scale datasets. In this paper, we propose a novel tri-stage deep cross-modal hashing method – Dual Deep Neural Networks Cross-Modal Hashing, i.e., DDCMH, which employs two deep networks to generate hash codes for different modalities. Specifically, in Stage 1, it leverages a single-modal hashing method to generate the initial binary codes of textual modality of training samples; in Stage 2, these binary codes are treated as supervised information to train an image network, which maps visual modality to a binary representation; in Stage 3, the visual modality codes are reconstructed according to a reconstruction procedure, and used as supervised information to train a text network, which generates the binary codes for textual modality. By doing this, DDCMH can make full use of inter-modal information to obtain high quality binary codes, and avoid the problem of pairwise optimization by optimizing different modalities independently. The proposed method can be treated as a framework which can extend any single-modal hashing method to perform cross-modal search task. DDCMH is tested on several benchmark datasets. The results demonstrate that it outperforms both deep and shallow state-of-the-art hashing methods.

关键词

Cross-Modal Retrieval; Hashing; Deep Neural Network

全文

PDF


Representation Learning for Scale-Free Networks

摘要

Network embedding aims to learn the low-dimensional representations of vertexes in a network, while structure and inherent properties of the network is preserved. Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored. Scale-free property depicts the fact that vertex degrees follow a heavy-tailed distribution (i.e., only a few vertexes have high degrees) and is a critical property of real-world networks, such as social networks. In this paper, we study the problem of learning representations for scale-free networks. We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in the Euclidean space, by converting our problem to the sphere packing problem. Then, we propose the "degree penalty" principle for designing scale-free property preserving network embedding algorithm: punishing the proximity between high-degree vertexes. We introduce two implementations of our principle by utilizing the spectral techniques and a skip-gram model respectively. Extensive experiments on six datasets show that our algorithms are able to not only reconstruct heavy-tailed distributed degree distribution, but also outperform state-of-the-art embedding models in various network mining tasks, such as vertex classification and link prediction.

关键词

representation learning; scale-free property; social network

全文

PDF


VSE-ens: Visual-Semantic Embeddings with Efficient Negative Sampling

摘要

Jointing visual-semantic embeddings (VSE) have become a research hotpot for the task of image annotation, which suffers from the issue of semantic gap, i.e., the gap between images' visual features (low-level) and labels' semantic features (high-level). This issue will be even more challenging if visual features cannot be retrieved from images, that is, when images are only denoted by numerical IDs as given in some real datasets. The typical way of existing VSE methods is to perform a uniform sampling method for negative examples that violate the ranking order against positive examples, which requires a time-consuming search in the whole label space. In this paper, we propose a fast adaptive negative sampler that can work well in the settings of no figure pixels available. Our sampling strategy is to choose the negative examples that are most likely to meet the requirements of violation according to the latent factors of images. In this way, our approach can linearly scale up to large datasets. The experiments demonstrate that our approach converges 5.02x faster than the state-of-the-art approaches on OpenImages, 2.5x on IAPR-TCI2 and 2.06x on NUS-WIDE datasets, as well as better ranking accuracy across datasets.

关键词

Visual Semantic Embeddings

全文

PDF


Partial Multi-View Outlier Detection Based on Collective Learning

摘要

In the past decade, various multi-view outlier detection methods have been designed to detect horizontal outliers that exhibit inconsistent across-view characteristics. The existing works assume that all objects are present in all views. However, in real-world applications, it is often the incomplete case that every view may suffer from some missing samples, resulting in partial objects difficult to detect outliers from. To address this problem, we propose a novel Collective Learning (CL) based framework to detect outliers from partial multi-view data in a self-guided way. More specifically, by well exploiting the inter-dependence among different views, we develop an algorithm to reconstruct missing samples based on learning. Furthermore, we propose similarity-based outlier detection to break through the dilemma that the number of clusters is unknown priori. Then, the calculated outlier scores act as the confidence levels in CL and in turn guide the reconstruction of missing data. Learning-based missing sample recovery and similarity-based outlier detection are iteratively performed in a self-guided manner. Experimental results on benchmark datasets show that our proposed approach consistently and significantly outperforms state-of-the-art baselines.

关键词

全文

PDF


A Network-Specific Markov Random Field Approach to Community Detection

摘要

Markov Random Field (MRF) is a powerful framework for developing probabilistic models of complex problems. MRF models possess rich structures to represent properties and constraints of a problem. It has been successful on many application problems, particularly those of computer vision and image processing, where data are structured, e.g., pixels are organized on grids. The problem of identifying communities in networks, which is essential for network analysis, is in principle analogous to finding objects in images. It is surprising that MRF has not yet been explored for network community detection. It is challenging to apply MRF to network analysis problems where data are organized on graphs with irregular structures. Here we present a network-specific MRF approach to community detection. The new method effectively encodes the structural properties of an irregular network in an energy function (the core of an MRF model) so that the minimization of the function gives rise to the best community structures. We analyzed the new MRF-based method on several synthetic benchmarks and real-world networks, showing its superior performance over the state-of-the-art methods for community identification.

关键词

Community Detection; Markov Random Field; Belief Propagation

全文

PDF


Robust Detection of Link Communities in Large Social Networks by Exploiting Link Semantics

摘要

Community detection has been extensively studied for various applications, focusing primarily on network topologies. Recent research has started to explore node contents to identify semantically meaningful communities and interpret their structures using selected words. However, links in real networks typically have semantic descriptions, e.g., comments and emails in social media, supporting the notion of communities of links. Indeed, communities of links can better describe multiple roles that nodes may play and provide a richer characterization of community behaviors than communities of nodes. The second issue in community finding is that most existing methods assume network topologies and descriptive contents to be consistent and to carry the compatible information of node group membership, which is generally violated in real networks. These methods are also restricted to interpret one community with one topic. The third problem is that the existing methods have used top ranked words or phrases to label topics when interpreting communities. However, it is often difficult to comprehend the derived topics using words or phrases, which may be irrelevant. To address these issues altogether, we propose a new unified probabilistic model that can be learned by a dual nested expectation-maximization algorithm. Our new method explores the intrinsic correlation between communities and topics to discover link communities robustly and extract adequate community summaries in sentences instead of words for topic labeling at the same time. It is able to derive more than one topical summary per community to provide rich explanations. We present experimental results to show the effectiveness of our new approach, and evaluate the quality of the results by a case study.

关键词

Social Networks; Community Detection; Topical Summary; Link Communities; Probabilistic Model

全文

PDF


On Validation and Predictability of Digital Badges’ Influence on Individual Users

摘要

Badges are a common, and sometimes the only, method of incentivizing users to perform certain actions on on- line sites. However, due to many competing factors influencing user temporal dynamics, it is difficult to determine whether the badge had (or will have) the intended effect or not. In this paper, we introduce two complementary approaches for determining badge influence on users. In the first one, we cluster users’ temporal traces (represented with Poisson processes) and apply covariates (user features) to regularize results. In the second approach, we first classify users’ temporal traces with a novel statistical framework, and then we refine the classification results with a semi-supervised clustering of covariates. Outcomes obtained from an evaluation on synthetic datasets and experiments on two badges from a pop- ular Q&A platform confirm that it is possible to validate, characterize and to some extent predict users affected by the badge.

关键词

badges; point processes; social media

全文

PDF


FILE: A Novel Framework for Predicting Social Status in Signed Networks

摘要

Link prediction in signed social networks is challenging because of the existence and imbalance of the three kinds of social status (positive, negative and no-relation). Furthermore, there are a variety types of no-relation status in reality, e.g., strangers and frenemies, which cannot be well distinguished from the other linked status by existing approaches. In this paper, we propose a novel Framework of Integrating both Latent and Explicit features (FILE), to better deal with the no-relation status and improve the overall link prediction performance in signed networks. In particular, we design two latent features from latent space and two explicit features by extending social theories, and learn these features for each user via matrix factorization with a specially designed ranking-oriented loss function. Experimental results demonstrate the superior of our approach over state-of-the-art methods.

关键词

全文

PDF


Community Detection in Attributed Graphs: An Embedding Approach

摘要

Community detection is a fundamental and widely-studied problem that finds all densely-connected groups of nodes and well separates them from others in graphs. With the proliferation of rich information available for entities in real-world networks, it is useful to discover communities in attributed graphs where nodes tend to have attributes. However, most existing attributed community detection methods directly utilize the original network topology leading to poor results due to ignoring inherent community structures. In this paper, we propose a novel embedding based model to discover communities in attributed graphs. Specifically, based on the observation of densely-connected structures in communities, we develop a novel community structure embedding method to encode inherent community structures via underlying community memberships. Based on node attributes and community structure embedding, we formulate the attributed community detection as a nonnegative matrix factorization optimization problem. Moreover, we carefully design iterative updating rules to make sure of finding a converging solution. Extensive experiments conducted on 19 attributed graph datasets with overlapping and non-overlapping ground-truth communities show that our proposed model CDE can accurately identify attributed communities and significantly outperform 7 state-of-the-art methods.

关键词

social networking and community identification

全文

PDF


Social Recommendation with an Essential Preference Space

摘要

Social recommendation, which aims to exploit social information to improve the quality of a recommender system, has attracted an increasing amount of attention in recent years. A large portion of existing social recommendation models are based on the tractable assumption that users consider the same factors to make decisions in both recommender systems and social networks. However, this assumption is not in concert with real-world situations, since users usually show different preferences in different scenarios. In this paper, we investigate how to exploit the differences between user preference in recommender systems and that in social networks, with the aim to further improve the social recommendation. In particular, we assume that the user preferences in different scenarios are results of different linear combinations from a more underlying user preference space. Based on this assumption, we propose a novel social recommendation framework, called social recommendation with an essential preferences space (SREPS), which simultaneously models the structural information in the social network, the rating and the consumption information in the recommender system under the capture of essential preference space. Experimental results on four real-world datasets demonstrate the superiority of the proposed SREPS model compared with seven state-of-the-art social recommendation methods.

关键词

全文

PDF


Early Detection of Fake News on Social Media Through Propagation Path Classification with Recurrent and Convolutional Networks

摘要

In the midst of today's pervasive influence of social media, automatically detecting fake news is drawing significant attention from both the academic communities and the general public. Existing detection approaches rely on machine learning algorithms with a variety of news characteristics to detect fake news. However, such approaches have a major limitation on detecting fake news early, i.e., the information required for detecting fake news is often unavailable or inadequate at the early stage of news propagation. As a result, the accuracy of early detection of fake news is low. To address this limitation, in this paper, we propose a novel model for early detection of fake news on social media through classifying news propagation paths. We first model the propagation path of each news story as a multivariate time series in which each tuple is a numerical vector representing characteristics of a user who engaged in spreading the news. Then, we build a time series classifier that incorporates both recurrent and convolutional networks which capture the global and local variations of user characteristics along the propagation path respectively, to detect fake news. Experimental results on three real-world datasets demonstrate that our proposed model can detect fake news with accuracy 85% and 92% on Twitter and Sina Weibo respectively in 5 minutes after it starts to spread, which is significantly faster than state-of-the-art baselines.

关键词

fake news detection;social media;information propagation;recurrent and convolutional networks;deep learning

全文

PDF


Cross-Lingual Entity Linking for Web Tables

摘要

This paper studies the problem of linking string mentions from web tables in one language to the corresponding named entities in a knowledge base written in another language, which we call the cross-lingual table linking task. We present a joint statistical model to simultaneously link all mentions that appear in one table. The framework is based on neural networks, aiming to bridge the language gap by vector space transformation and a coherence feature that captures the correlations between entities in one table. Experimental results report that our approach improves the accuracy of cross-lingual table linking by a relative gain of 12.1%. Detailed analysis of our approach also shows a positive and important gain brought by the joint framework and coherence feature.

关键词

entity linking; Web table; cross-lingual

全文

PDF


DepthLGP: Learning Embeddings of Out-of-Sample Nodes in Dynamic Networks

摘要

Network embedding algorithms to date are primarily designed for static networks, where all nodes are known before learning. How to infer embeddings for out-of-sample nodes, i.e. nodes that arrive after learning, remains an open problem. The problem poses great challenges to existing methods, since the inferred embeddings should preserve intricate network properties such as high-order proximity, share similar characteristics (i.e. be of a homogeneous space) with in-sample node embeddings, and be of low computational cost. To overcome these challenges, we propose a Deeply Transformed High-order Laplacian Gaussian Process (DepthLGP) method to infer embeddings for out-of-sample nodes. DepthLGP combines the strength of nonparametric probabilistic modeling and deep learning. In particular, we design a high-order Laplacian Gaussian process (hLGP) to encode network properties, which permits fast and scalable inference. In order to further ensure homogeneity, we then employ a deep neural network to learn a nonlinear transformation from latent states of the hLGP to node embeddings. DepthLGP is general, in that it is applicable to embeddings learned by any network embedding algorithms. We theoretically prove the expressive power of DepthLGP, and conduct extensive experiments on real-world networks. Empirical results demonstrate that our approach can achieve significant performance gain over existing approaches.

关键词

Network Embedding; Gaussian Process; Dynamic Network

全文

PDF


Listening to the World Improves Speech Command Recognition

摘要

We study transfer learning in convolutional network architectures applied to the task of recognizing audio, such as environmental sound events and speech commands. Our key finding is that not only is it possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition, but also that doing so improves accuracies significantly. We also investigate the effect of increased model capacity for transfer learning audio, by first validating known results from the field of Computer Vision of achieving better accuracies with increasingly deeper networks on two audio datasets: UrbanSound8k and Google Speech Commands. Then we propose a simple multiscale input representation using dilated convolutions and show that it is able to aggregate larger contexts and increase classification performance. Further, the models trained using a combination of transfer learning and multiscale input representations need only 50% of the training data to achieve similar accuracies as a freshly trained model with 100% of the training data. Finally, we demonstrate a positive interaction effect for the multiscale input and transfer learning, making a case for the joint application of the two techniques.

关键词

deep learning; transfer learning; speech recognition

全文

PDF


Location-Sensitive User Profiling Using Crowdsourced Labels

摘要

In this paper, we investigate the impact of spatial variation on the construction of location-sensitive user profiles. We demonstrate evidence of spatial variation over a collection of Twitter Lists, wherein we find that crowdsourced labels are constrained by distance. For example, that energy in San Francisco is more associated with the green movement, whereas in Houston it is more associated with oil and gas. We propose a three-step framework for location-sensitive user profiling: first, it constructs a crowdsourced label similarity graph, where each labeler and labelee are annotated with a geographic coordinate; second, it transforms this similarity graph into a directed weighted tree that imposes a hierarchical structure over these labels; third, it embeds this location-sensitive folksonomy into a user profile ranking algorithm that outputs a ranked list of candidate labels for a partially observed user profile. Through extensive experiments over a Twitter list dataset, we demonstrate the effectiveness of this location-sensitive user profiling.

关键词

user profiling; location-sensitive;

全文

PDF


Binary Generative Adversarial Networks for Image Retrieval

摘要

The most striking successes in image retrieval using deep hashing have mostly involved discriminative models, which require labels. In this paper, we use binary generative adversarial networks (BGAN) to embed images to binary codes in an unsupervised way. By restricting the input noise variable of generative adversarial networks (GAN) to be binary and conditioned on the features of each input image, BGAN can simultaneously learn a binary representation per image, and generate an image plausibly similar to the original one. In the proposed framework, we address two main problems: 1) how to directly generate binary codes without relaxation? 2) how to equip the binary representation with the ability of accurate image retrieval? We resolve these problems by proposing new sign-activation strategy and a loss function steering the learning process, which consists of new models for adversarial loss, a content loss, and a neighborhood structure loss. Experimental results on standard datasets (CIFAR-10, NUSWIDE, and Flickr) demonstrate that our BGAN significantly outperforms existing hashing methods by up to 107% in terms of mAP (See Table 2).

关键词

hashing; GAN; image retrieval

全文

PDF


Deep Region Hashing for Generic Instance Search from Images

摘要

Instance Search (INS) is a fundamental problem for many applications, while it is more challenging comparing to traditional image search since the relevancy is defined at the instance level. Existing works have demonstrated the success of many complex ensemble systems that are typically conducted by firstly generating object proposals, and then extracting handcrafted and/or CNN features of each proposal for matching. However, object bounding box proposals and feature extraction are often conducted in two separated steps, thus the effectiveness of these methods collapses. Also, due to the large amount of generated proposals, matching speed becomes the bottleneck that limits its application to large-scale datasets. To tackle these issues, in this paper we propose an effective and efficient Deep Region Hashing (DRH) approach for large-scale INS using an image patch as the query. Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation. DRH shares full-image convolutional feature map with the region proposal network, thus enabling nearly cost-free region proposals. Also, each high-dimensional, real-valued region features are mapped onto a low-dimensional, compact binary codes for the efficient object region level matching on large-scale dataset. Experimental results on four datasets show that our DRH can achieve even better performance than the state-of-the-arts in terms of mAP, while the efficiency is improved by nearly 100 times.

关键词

hashing; region; image retrieval

全文

PDF


Improved English to Russian Translation by Neural Suffix Prediction

摘要

Neural machine translation (NMT) suffers a performance deficiency when a limited vocabulary fails to cover the source or target side adequately, which happens frequently when dealing with morphologically rich languages. To address this problem, previous work focused on adjusting translation granularity or expanding the vocabulary size. However, morphological information is relatively under-considered in NMT architectures, which may further improve translation quality. We propose a novel method, which can not only reduce data sparsity but also model morphology through a simple but effective mechanism. By predicting the stem and suffix separately during decoding, our system achieves an improvement of up to 1.98 BLEU compared with previous work on English to Russian translation. Our method is orthogonal to different NMT architectures and stably gains improvements on various domains.

关键词

morphology; Russain; machine translation

全文

PDF


Towards Efficient Detection of Overlapping Communities in Massive Networks

摘要

Community detection is essential to analyzing and exploring natural networks such as social networks, biological networks, and citation networks. However, few methods could be used as off-the-shelf tools to detect communities in real world networks for two reasons. On the one hand, most existing methods for community detection cannot handle massive networks that contain millions or even hundreds of millions of nodes. On the other hand, communities in real world networks are generally highly overlapped, requiring that community detection method could capture the mixed community membership. In this paper, we aim to offer an off-the-shelf method to detect overlapping communities in massive real world networks. For this purpose, we take the widely-used Poisson model for overlapping community detection as starting point and design two speedup strategies to achieve high efficiency. Extensive tests on synthetic and large scale real networks demonstrate that the proposed strategies speedup the community detection method based on Poisson model by 1 to 2 orders of magnitudes, while achieving comparable accuracy at community detection.

关键词

全文

PDF


Structural Deep Embedding for Hyper-Networks

摘要

Network embedding has recently attracted lots of attentions in data mining. Existing network embedding methods mainly focus on networks with pairwise relationships. In real world, however, the relationships among data points could go beyond pairwise, i.e., three or more objects are involved in each relationship represented by a hyperedge, thus forming hyper-networks. These hyper-networks pose great challenges to existing network embedding methods when the hyperedges are indecomposable, that is to say, any subset of nodes in a hyperedge cannot form another hyperedge. These indecomposable hyperedges are especially common in heterogeneous networks. In this paper, we propose a novel Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with indecomposable hyperedges. More specifically, we theoretically prove that any linear similarity metric in embedding space commonly used in existing methods cannot maintain the indecomposibility property in hyper-networks, and thus propose a new deep model to realize a non-linear tuplewise similarity function while preserving both local and global proximities in the formed embedding space. We conduct extensive experiments on four different types of hyper-networks, including a GPS network, an online social network, a drug network and a semantic network. The empirical results demonstrate that our method can significantly and consistently outperform the state-of-the-art algorithms.

关键词

全文

PDF


Confidence-Aware Matrix Factorization for Recommender Systems

摘要

Collaborative filtering (CF), particularly matrix factorization (MF) based methods, have been widely used in recommender systems. The literature has reported that matrix factorization methods often produce superior accuracy of rating prediction in recommender systems. However, existing matrix factorization methods rarely consider confidence of the rating prediction and thus cannot support advanced recommendation tasks. In this paper, we propose a Confidence-aware Matrix Factorization (CMF) framework to simultaneously optimize the accuracy of rating prediction and measure the prediction confidence in the model. Specifically, we introduce variance parameters for both users and items in the matrix factorization process. Then, prediction interval can be computed to measure confidence for each predicted rating. These confidence quantities can be used to enhance the quality of recommendation results based on Confidence-aware Ranking (CR). We also develop two effective implementations of our framework to compute the confidence-aware matrix factorization for large-scale data. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness of our framework from multiple perspectives.

关键词

Recommender systems; Matrix factorization; Confidence; Variance

全文

PDF


Deep Asymmetric Transfer Network for Unbalanced Domain Adaptation

摘要

Recently, domain adaptation based on deep models has been a promising way to deal with the domains with scarce labeled data, which is a critical problem for deep learning models. Domain adaptation propagates the knowledge from a source domain with rich information to the target domain. In reality, the source and target domains are mostly unbalanced in that the source domain is more resource-rich and thus has more reliable knowledge than the target domain. However, existing deep domain adaptation approaches often pre-assume the source and target domains balanced and equally, leading to a medium solution between the source and target domains, which is not optimal for the unbalanced domain adaptation. In this paper, we propose a novel Deep Asymmetric Transfer Network (DATN) to address the problem of unbalanced domain adaptation. Specifically, our model will learn a transfer function from the target domain to the source domain and meanwhile adapting the source domain classifier with more discriminative power to the target domain. By doing this, the deep model is able to adaptively put more emphasis on the resource-rich source domain. To alleviate the scarcity problem of supervised data, we further propose an unsupervised transfer method to propagate the knowledge from a lot of unsupervised data by minimizing the distribution discrepancy over the unlabeled data of two domains. The experiments on two real-world datasets demonstrate that DATN attains a substantial gain over state-of-the-art methods.

关键词

全文

PDF


A Multi-Task Learning Approach for Improving Product Title Compression with User Search Log Data

摘要

It is a challenging and practical research problem to obtain effective compression of lengthy product titles for E-commerce. This is particularly important as more and more users browse mobile E-commerce apps and more merchants make the original product titles redundant and lengthy for Search Engine Optimization. Traditional text summarization approaches often require a large amount of preprocessing costs and do not capture the important issue of conversion rate in E-commerce. This paper proposes a novel multi-task learning approach for improving product title compression with user search log data. In particular, a pointer network-based sequence-to-sequence approach is utilized for title compression with an attentive mechanism as an extractive method and an attentive encoder-decoder approach is utilized for generating user search queries. The encoding parameters (i.e., semantic embedding of original titles) are shared among the two tasks and the attention distributions are jointly optimized. An extensive set of experiments with both human annotated data and online deployment demonstrate the advantage of the proposed research for both compression qualities and online business values.

关键词

Sentence Summarization; Product Title Compression; Multi-task Learning

全文

PDF


Personalized Time-Aware Tag Recommendation

摘要

Personalized tag recommender systems suggest a list of tags to a user when he or she wants to annotate an item. They utilize users’ preferences and the features of items. Tensorfactorization techniques have been widely used in tag recommendation. Given the user-item pair, although the classic PITF (Pairwise Interaction Tensor Factorization) explicitly models the pairwise interactions among users, items and tags, it overlooks users’ short-term interests and suffers from data sparsity. On the other hand, given the user-item-time triple, time-aware approaches like BLL (Base-Level Learning) utilize the time effect to capture the temporal dynamics and the most popular tags on items to handle cold start situation of new users. However, it works only on individual level and the target resource level, which cannot find users’ potential interests. In this paper, we propose an unified tag recommendation approach by considering both time awareness and personalization aspects, which extends PITF by adding weightsto user-tag interaction and item-tag interaction respectively. Compared to PITF, our proposed model can depict temporal factor by temporal weights and relieve data sparsity problem by referencing the most popular tags on items. Further, our model brings collaborative filtering (CF) to time-aware models, which can mine information from global data and help improving the ability of recommending new tags. Different from the power-form functions used in the existing time aware recommendation models, we use the Hawkes process with the exponential intensity function to improve the model’s efficiency. The experimental results show that our proposed model outperforms the state of the art tag recommendation methods in accuracy and has better ability to recommend new tags.

关键词

全文

PDF


Telepath: Understanding Users from a Human Vision Perspective in Large-Scale Recommender Systems

摘要

Designing an e-commerce recommender system that serves hundreds of millions of active users is a daunting challenge. To our best knowledge, the complex brain activity mechanism behind human shopping activities is never considered in existing recommender systems. From a human vision perspective, we found two key factors that affect users’ behaviors: items’ attractiveness and their matching degrees with users’ interests. This paper proposes Telepath, a vision-based bionic recommender system model, which simulates human brain activities in decision making of shopping, thus understanding users from such perspective. The core of Telepath is a complex deep neural network with multiple subnetworks. In practice, the Telepath model has been launched to JD’s recommender system and advertising system and outperformed the former state-of-the-art method. For one of the major item recommendation blocks on the JD app, click-through rate (CTR), gross merchandise value (GMV) and orders have been increased 1.59%, 8.16% and 8.71% respectively by Telepath. For several major ad publishers of JD demand-side platform, CTR, GMV and return on investment have been increased 6.58%, 61.72% and 65.57% respectively by the first launch of Telepath, and further increased 2.95%, 41.75% and 41.37% respectively by the second launch.

关键词

全文

PDF


RSDNE: Exploring Relaxed Similarity and Dissimilarity from Completely-Imbalanced Labels for Network Embedding

摘要

Network embedding, aiming to project a network into a low-dimensional space, is increasingly becoming a focus of network research. Semi-supervised network embedding takes advantage of labeled data, and has shown promising performance. However, existing semi-supervised methods would get unappealing results in the completely-imbalanced label setting where some classes have no labeled nodes at all. To alleviate this, we propose a novel semi-supervised network embedding method, termed Relaxed Similarity and Dissimilarity Network Embedding (RSDNE). Specifically, to benefit from the completely-imbalanced labels, RSDNE guarantees both intra-class similarity and inter-class dissimilarity in an approximate way. Experimental results on several real-world datasets demonstrate the superiority of the proposed method.

关键词

network embedding; social networks; artificial intelligence

全文

PDF


Contrastive Training for Models of Information Cascades

摘要

This paper proposes a model of information cascades as directed spanning trees (DSTs) over observed documents. In addition, we propose a contrastive training procedure that exploits partial temporal ordering of node infections in lieu of labeled training links. This combination of model and unsupervised training makes it possible to improve on models that use infection times alone and to exploit arbitrary features of the nodes and of the text content of messages in information cascades. With only basic node and time lag features similar to previous models, the DST model achieves performance with unsupervised training comparable to strong baselines on a blog network inference task. Unsupervised training with additional content features achieves significantly better results, reaching half the accuracy of a fully supervised model.

关键词

Information Diffusion; Information Cascades; Social Networks; Text Reuse

全文

PDF


Retrieving and Classifying Affective Images via Deep Metric Learning

摘要

Affective image understanding has been extensively studied in the last decade since more and more users express emotion via visual contents. While current algorithms based on convolutional neural networks aim to distinguish emotional categories in a discrete label space, the task is inherently ambiguous. This is mainly because emotional labels with the same polarity (i.e., positive or negative) are highly related, which is different from concrete object concepts such as cat, dog and bird. To the best of our knowledge, few methods focus on leveraging such characteristic of emotions for affective image understanding. In this work, we address the problem of understanding affective images via deep metric learning and propose a multi-task deep framework to optimize both retrieval and classification goals. We propose the sentiment constraints adapted from the triplet constraints, which are able to explore the hierarchical relation of emotion labels. We further exploit the sentiment vector as an effective representation to distinguish affective images utilizing the texture representation derived from convolutional layers. Extensive evaluations on four widely-used affective datasets, i.e., Flickr and Instagram, IAPSa, Art Photo, and Abstract Paintings, demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods on both affective image retrieval and classification tasks.

关键词

全文

PDF


Multi-Facet Network Embedding: Beyond the General Solution of Detection and Representation

摘要

In network analysis, community detection and network embedding are two important topics. Community detection tends to obtain the most noticeable partition, while network embedding aims at seeking node representations which contains as many diverse properties as possible. We observe that the current community detection and network embedding problems are being resolved by a general solution, i.e., "maximizing the consistency between similar nodes while maximizing the distance between the dissimilar nodes." This general solution only exploits the most noticeable structure (facet) of the network, which effectively satisfies the demands of the community detection. Unfortunately, most of the specific embedding algorithms, which are developed from the general solution, cannot achieve the goal of network embedding by exploring only one facet of the network. To improve the general solution for better modeling the real network, we propose a novel network embedding method, Multi-facet Network Embedding (MNE), to capture the multiple facets of the network. MNE learns multiple embeddings simultaneously, with the Hilbert Schmidt Independence Criterion (HSIC) being the a diversity constraint. To efficiently solve the optimization problem, we propose a Binary HSIC with linear complexity and solve the MNE objective function by adopting the Augmented Lagrange Multiplier (ALM) method. The overall complexity is linear with the scale of the network. Extensive results demonstrate that MNE gives efficient performances and outperforms the state-of-the-art network embedding methods.

关键词

全文

PDF


Urban Dreams of Migrants: A Case Study of Migrant Integration in Shanghai

摘要

Unprecedented human mobility has driven the rapid urbanization around the world. In China, the fraction of population dwelling in cities increased from 17.9% to 52.6% between 1978 and 2012. Such large-scale migration poses challenges for policymakers and important questions for researchers. To investigate the process of migrant integration, we employ a one-month complete dataset of telecommunication metadata in Shanghai with 54 million users and 698 million call logs. We find systematic differences between locals and migrants in their mobile communication networks and geographical locations. For instance, migrants have more diverse contacts and move around the city with a larger radius than locals after they settle down. By distinguishing new migrants (who recently moved to Shanghai) from settled migrants (who have been in Shanghai for a while), we demonstrate the integration process of new migrants in their first three weeks. Moreover, we formulate classification problems to predict whether a person is a migrant. Our classifier is able to achieve an F1-score of 0.82 when distinguishing settled migrants from locals, but it remains challenging to identify new migrants because of class imbalance. This classification setup holds promise for identifying new migrants who will successfully integrate into locals (new migrants that misclassified as locals).

关键词

urban migrants; migrant integration; mobile communication networks; geographical locations

全文

PDF


From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions

摘要

Visual attributes, which refer to human-labeled semantic annotations, have gained increasing popularity in a wide range of real world applications. Generally, the existing attribute learning methods fall into two categories: one focuses on learning user-specific labels separately for different attributes, while the other one focuses on learning crowd-sourced global labels jointly for multiple attributes. However, both categories ignore the joint effect of the two mentioned factors: the personal diversity with respect to the global consensus; and the intrinsic correlation among multiple attributes. To overcome this challenge, we propose a novel model to learn user-specific predictors across multiple attributes. In our proposed model, the diversity of personalized opinions and the intrinsic relationship among multiple attributes are unified in a common-to-special manner. To this end, we adopt a three-component decomposition. Specifically, our model integrates a common cognition factor, an attribute-specific bias factor and a user-specific bias factor. Meanwhile Lasso and group Lasso penalties are adopted to leverage efficient feature selection. Furthermore, theoretical analysis is conducted to show that our proposed method could reach reasonable performance. Eventually, the empirical study carried out in this paper demonstrates the effectiveness of our proposed method.

关键词

machine learning; multi-task learning; attribute learning

全文

PDF


Discovering and Distinguishing Multiple Visual Senses for Polysemous Words

摘要

To reduce the dependence on labeled data, there have been increasing research efforts on learning visual classifiers by exploiting web images. One issue that limits their performance is the problem of polysemy. To solve this problem, in this work, we present a novel framework that solves the problem of polysemy by allowing sense-specific diversity in search results. Specifically, we first discover a list of possible semantic senses to retrieve sense-specific images. Then we merge visual similar semantic senses and prune noises by using the retrieved images. Finally, we train a visual classifier for each selected semantic sense and use the learned sense-specific classifiers to distinguish multiple visual senses. Extensive experiments on classifying images into sense-specific categories and re-ranking search results demonstrate the superiority of our proposed approach.

关键词

Visual Polysemy; Multiple Visual Senses; Polysemous Words

全文

PDF


Spatiotemporal Activity Modeling Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach

摘要

Spatiotemporal activity modeling, which aims at modeling users' activities at different locations and time from user behavioral data, is an important task for applications like urban planning and mobile advertising. State-of-the-art methods for this task use cross-modal embedding to map the units from different modalities (location, time, text) into the same latent space. However, the success of such methods relies on data sufficiency, and may not learn quality embeddings when user behavioral data is scarce. To address this problem, we propose BranchNet, a spatiotemporal activity model that transfers knowledge from external sources for alleviating data scarcity. BranchNet adopts a graph-regularized cross-modal embedding framework. At the core of it is a main embedding space, which is shared by the main task of reconstructing user behaviors and the auxiliary graph embedding tasks for external sources, thus allowing external knowledge to guide the cross-modal embedding process. In addition to the main embedding space, the auxiliary tasks also have branched task-specific embedding spaces. The branched embeddings capture the discrepancies between the main task and the auxiliary ones, and free the main embeddings from encoding information for all the tasks. We have empirically evaluated the performance of BranchNet, and found that it is capable of effectively transferring knowledge from external sources to learn better spatiotemporal activity models and outperforming strong baseline methods.

关键词

spatiotemporal data; social media; embedding; graph

全文

PDF


Unsupervised Generative Adversarial Cross-Modal Hashing

摘要

Cross-modal hashing aims to map heterogeneous multimedia data into a common Hamming space, which can realize fast and flexible retrieval across different modalities. Unsupervised cross-modal hashing is more flexible and applicable than supervised methods, since no intensive labeling work is involved. However, existing unsupervised methods learn hashing functions by preserving inter and intra correlations, while ignoring the underlying manifold structure across different modalities, which is extremely helpful to capture meaningful nearest neighbors of different modalities for cross-modal retrieval. To address the above problem, in this paper we propose an Unsupervised Generative Adversarial Cross-modal Hashing approach (UGACH), which makes full use of GAN's ability for unsupervised representation learning to exploit the underlying manifold structure of cross-modal data. The main contributions can be summarized as follows: (1) We propose a generative adversarial network to model cross-modal hashing in an unsupervised fashion. In the proposed UGACH, given a data of one modality, the generative model tries to fit the distribution over the manifold structure, and select informative data of another modality to challenge the discriminative model. The discriminative model learns to distinguish the generated data and the true positive data sampled from correlation graph to achieve better retrieval accuracy. These two models are trained in an adversarial way to improve each other and promote hashing function learning. (2) We propose a correlation graph based approach to capture the underlying manifold structure across different modalities, so that data of different modalities but within the same manifold can have smaller Hamming distance and promote retrieval accuracy. Extensive experiments compared with 6 state-of-the-art methods on 2 widely-used datasets verify the effectiveness of our proposed approach.

关键词

全文

PDF


Exploring Implicit Feedback for Open Domain Conversation Generation

摘要

User feedback can be an effective indicator to the success of the human-robot conversation. However, to avoid to interrupt the online real-time conversation process, explicit feedback is usually gained at the end of a conversation. Alternatively, users' responses usually contain their implicit feedback, such as stance, sentiment, emotion, etc., towards the conversation content or the interlocutors. Therefore, exploring the implicit feedback is a natural way to optimize the conversation generation process. In this paper, we propose a novel reward function which explores the implicit feedback to optimize the future reward of a reinforcement learning based neural conversation model. A simulation strategy is applied to explore the state-action space in training and test. Experimental results show that the proposed approach outperforms the Seq2Seq model and the state-of-the-art reinforcement learning model for conversation generation on automatic and human evaluations on the OpenSubtitles and Twitter datasets.

关键词

Conversation Generation; Implicit Feedback

全文

PDF


Joint Training for Neural Machine Translation Models with Monolingual Data

摘要

Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for neural machine translation by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial NMT models pre-trained on parallel data for each direction, and these two models are iteratively updated by incrementally decreasing translation losses on training data.In each iteration step, both NMT models are first used to translate monolingual data from one language to the other, forming pseudo-training data of the other NMT model. Then two new NMT models are learnt from parallel data together with the pseudo training data. Both NMT models are expected to be improved and better pseudo-training data can be generated in next step. Experiment results on Chinese-English and English-German translation tasks show that our approach can simultaneously improve translation quality of source-to-target and target-to-source models, significantly outperforming strong baseline systems which are enhanced with monolingual data for model training including back-translation.

关键词

Neural Machine Translation; Monolingual Data; Joint Training

全文

PDF


Attention-via-Attention Neural Machine Translation

摘要

Since many languages originated from a common ancestral language and influence each other, there would inevitably exist similarities between these languages such as lexical similarity and named entity similarity. In this paper, we leverage these similarities to improve the translation performance in neural machine translation. Specifically, we introduce an attention-via-attention mechanism that allows the information of source-side characters flowing to the target side directly. With this mechanism, the target-side characters will be generated based on the representation of source-side characters when the words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word "system" through the attention-via-attention mechanism to generate the Czech word "systém." Consequently, our approach is able to not only achieve a competitive translation performance, but also reduce the model size significantly.

关键词

translation; lexical similarity; character level;

全文

PDF


Dynamic Network Embedding by Modeling Triadic Closure Process

摘要

Network embedding, which aims to learn the low-dimensional representations of vertices, is an important task and has attracted considerable research efforts recently. In real world, networks, like social network and biological networks, are dynamic and evolving over time. However, almost all the existing network embedding methods focus on static networks while ignore network dynamics. In this paper, we present a novel representation learning approach, DynamicTriad, to preserve both structural information and evolution patterns of a given network. The general idea of our approach is to impose triad, which is a group of three vertices and is one of the basic units of networks. In particular, we model how a closed triad, which consists of three vertices connected with each other, develops from an open triad that has two of three vertices not connected with each other. This triadic closure process is a fundamental mechanism in the formation and evolution of networks, thereby makes our model being able to capture the network dynamics and to learn representation vectors for each vertex at different time steps. Experimental results on three real-world networks demonstrate that, compared with several state-of-the-art techniques, DynamicTriad achieves substantial gains in several application scenarios. For example, our approach can effectively be applied and help to identify telephone frauds in a mobile network, and to predict whether a user will repay her loans or not in a loan network.

关键词

Network Embedding; Dynamic Network; Triad Closure Process

全文

PDF


Inferring Emotion from Conversational Voice Data: A Semi-Supervised Multi-Path Generative Neural Network Approach

摘要

To give a more humanized response in Voice Dialogue Applications (VDAs), inferring emotion states from users’ queries may play an important role. However, in VDAs, we have tremendous amount of VDA users and massive scale of unlabeled data with high dimension features from multimodal information, which challenge the traditional speech emotion recognition methods. In this paper, to better infer emotion from conversational voice data, we proposed a semi-supervised multi-path generative neural network. Specifically, first, we build a novel supervised multi-path deep neural network framework. To avoid high dimensional input, raw features are trained by groups in local classifiers. Then high-level features of each local classifiers are concatenated as input of a global classifier. These two kinds classifiers are trained simultaneously through a single objective function to achieve a more effective and discriminative emotion inferring. To further solve the labeled-data-scarcity problem, we extend the multi-path deep neural network to a generative model based on semi-supervised variational autoencoder (semi-VAE), which is able to train the labeled and unlabeled data simultaneously. Experiment based on a 24,000 real-world dataset collected from Sogou Voice Assistant (SVAD13) and a benchmark dataset IEMOCAP show that our method significantly outperforms the existing state-of-the-art results.

关键词

Emotion; variational autoencoder; semi-supervise

全文

PDF


Learning Nonlinear Dynamics in Efficient, Balanced Spiking Networks Using Local Plasticity Rules

摘要

The brain uses spikes in neural circuits to perform many dynamical computations. The computations are performed with properties such as spiking efficiency, i.e. minimal number of spikes, and robustness to noise. A major obstacle for learning computations in artificial spiking neural networks with such desired biological properties is due to lack of our understanding of how biological spiking neural networks learn computations. Here, we consider the credit assignment problem, i.e. determining the local contribution of each synapse to the network's global output error, for learning nonlinear dynamical computations in a spiking network with the desired properties of biological networks. We approach this problem by fusing the theory of efficient, balanced neural networks (EBN) with nonlinear adaptive control theory to propose a local learning rule. Locality of learning rules are ensured by feeding back into the network its own error, resulting in a learning rule depending solely on presynaptic inputs and error feedbacks. The spiking efficiency and robustness of the network are guaranteed by maintaining a tight excitatory/inhibitory balance, ensuring that each spike represents a local projection of the global output error and minimizes a loss function. The resulting networks can learn to implement complex dynamics with very small numbers of neurons and spikes, exhibit the same spike train variability as observed experimentally, and are extremely robust to noise and neuronal loss.

关键词

Learning, Local Plasticity Rules, Spiking Neural Networks, Efficiency, Robustness, Dynamical Systems, Nonlinear Dynamics, E-I Balance

全文

PDF


Perceiving, Learning, and Recognizing 3D Objects: An Approach to Cognitive Service Robots

摘要

There is growing need for robots that can interact with people in everyday situations. For service robots, it is not reasonable to assume that one can pre-program all object categories. Instead, apart from learning from a batch of labelled training data, robots should continuously update and learn new object categories while working in the environment. This paper proposes a cognitive architecture designed to create a concurrent 3D object category learning and recognition in an interactive and open-ended manner. In particular, this cognitive architecture provides automatic perception capabilities that will allow robots to detect objects in highly crowded scenes and learn new object categories from the set of accumulated experiences in an incremental and open-ended way. Moreover, it supports constructing the full model of an unknown object in an on-line manner and predicting next best view for improving object detection and manipulation performance. We provide extensive experimental results demonstrating system performance in terms of recognition, scalability, next-best-view prediction and real-world robotic applications.

关键词

Cognitive Architectures;

全文

PDF


A Unified Model for Document-Based Question Answering Based on Human-Like Reading Strategy

摘要

Document-based Question Answering (DBQA) in Natural Language Processing (NLP) is important but difficult because of the long document and the complex question. Most of previous deep learning methods mainly focus on the similarity computation between two sentences. However, DBQA stems from the reading comprehension in some degree, which is originally used to train and test people's ability of reading and logical thinking. Inspired by the strategy of doing reading comprehension tests, we propose a unified model based on the human-like reading strategy. The unified model contains three major encoding layers that are consistent to different steps of the reading strategy, including the basic encoder, combined encoder and hierarchical encoder. We conduct extensive experiments on both the English WikiQA dataset and the Chinese dataset, and the experimental results show that our unified model is effective and yields state-of-the-art results on WikiQA dataset.

关键词

Reading Strategy; Deep Learning

全文

PDF


Thinking in PolAR Pictures: Using Rotation-Friendly Mental Images to Solve Leiter-R Form Completion

摘要

The Leiter International Performance Scale-Revised (Leiter-R) is a standardized cognitive test that seeks to "provide a nonverbal measure of general intelligence by sampling a wide variety of functions from memory to nonverbal reasoning." Understanding the computational building blocks of nonverbal cognition, as measured by the Leiter-R, is an important step towards understanding human nonverbal cognition, especially with respect to typical and atypical trajectories of child development. One subtest of the Leiter-R, Form Completion, involves synthesizing and localizing a visual figure from its constituent slices. Form Completion poses an interesting nonverbal problem that seems to combine several aspects of visual memory, mental rotation, and visual search. We describe a new computational cognitive model that addresses Form Completion using a novel, mental-rotation-friendly image representation that we call the Polar Augmented Resolution (PolAR) Picture, which enables high-fidelity mental rotation operations. We present preliminary results using actual Leiter-R test items and discuss directions for future work.

关键词

mental rotation; neuropsychological assessment; visual mental imagery.

全文

PDF


A Plasticity-Centric Approach to Train the Non-Differential Spiking Neural Networks

摘要

Many efforts have been taken to train spiking neural networks (SNNs), but most of them still need improvements due to the discontinuous and non-differential characteristics of SNNs. While the mammalian brains solve these kinds of problems by integrating a series of biological plasticity learning rules. In this paper, we will focus on two biological plausible methodologies and try to solve these catastrophic training problems in SNNs. Firstly, the biological neural network will try to keep a balance between inputs and outputs on both the neuron and the network levels. Secondly, the biological synaptic weights will be passively updated by the changes of the membrane potentials of the neighbour-hood neurons, and the plasticity of synapses will not propagate back to other previous layers. With these biological inspirations, we propose Voltage-driven Plasticity-centric SNN (VPSNN), which includes four steps, namely: feed forward inference, unsupervised equilibrium state learning, supervised last layer learning and passively updating synaptic weights based on spike-timing dependent plasticity (STDP). Finally we get the accuracy of 98.52% on the hand-written digits classification task on MNIST. In addition, with the help of a visualization tool, we try to analyze the black box of SNN and get better understanding of what benefits have been acquired by the proposed method.

关键词

Spiking neural network; Unsupervised learning; Supervised learning

全文

PDF


Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

摘要

Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-of-the-art systems attempted to solve the task using deep neural architectures and achieved promising performance. However, the resulting systems are generally opaque and they struggle in understanding questions for which extra knowledge is required. In this paper, we present an explicit reasoning layer on top of a set of penultimate neural network based systems. The reasoning layer enables reasoning and answering questions where additional knowledge is required, and at the same time provides an interpretable interface to the end users. Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based engine to reason over a basket of inputs: visual relations, the semantic parse of the question, and background ontological knowledge from word2vec and ConceptNet. Experimental analysis of the answers and the key evidential predicates generated on the VQA dataset validate our approach.

关键词

vision, dense captioning, reasoning, probabilistic logic

全文

PDF


Action Recognition From Skeleton Data via Analogical Generalization Over Qualitative Representations

摘要

Human action recognition remains a difficult problem for AI. Traditional machine learning techniques can have high recognition accuracy, but they are typically black boxes whose internal models are not inspectable and whose results are not explainable. This paper describes a new pipeline for recognizing human actions from skeleton data via analogical generalization. Specifically, starting with Kinect data, we segment each human action by temporal regions where the motion is qualitatively uniform, creating a sketch graph that provides a form of qualitative representation of the behavior that is easy to visualize. Models are learned from sketch graphs via analogical generalization, which are then used for classification via analogical retrieval. The retrieval process also produces links between the new example and components of the model that provide explanations. To improve recognition accuracy, we implement dynamic feature selection to pick reasonable relational features. We show the explanation advantage of our approach by example, and results on three public datasets illustrate its utility.

关键词

Cognitive system, Qualitative reasoning, Action recognition

全文

PDF


Glass-Box Program Synthesis: A Machine Learning Approach

摘要

Recently proposed models which learn to write computer programs from data use either input/output examples or rich execution traces. Instead, we argue that a novel alternative is to use a glass-box scoring function, given as a program itself that can be directly inspected. Glass-box optimization covers a wide range of problems, from computing the greatest common divisor of two integers, to learning-to-learn problems. In this paper, we present an intelligent search system which learns, given the partial program and the glass-box problem, the probabilities over the space of programs. We empirically demonstrate that our informed search procedure leads to significant improvements compared to brute-force program search, both in terms of accuracy and time. For our experiments we use rich context free grammars inspired by number theory, text processing, and algebra. Our results show that (i) running our framework iteratively can considerably increase the number of problems solved, (ii) our framework can improve itself even in domain agnostic scenarios, and (iii) it can solve problems that would be otherwise too slow to solve with brute-force search.

关键词

Program Synthesis; Machine Learning; Problem Solving

全文

PDF


Learning From Unannotated QA Pairs to Analogically Disambiguate and Answer Questions

摘要

Creating systems that can learn to answer natural language questions has been a longstanding challenge for artificial intelligence. Most prior approaches focused on producing a specialized language system for a particular domain and dataset, and they required training on a large corpus manually annotated with logical forms. This paper introduces an analogy-based approach that instead adapts an existing general purpose semantic parser to answer questions in a novel domain by jointly learning disambiguation heuristics and query construction templates from purely textual question-answer pairs. Our technique uses possible semantic interpretations of the natural language questions and answers to constrain a query-generation procedure, producing cases during training that are subsequently reused via analogical retrieval and composed to answer test questions. Bootstrapping an existing semantic parser in this way significantly reduces the number of training examples needed to accurately answer questions. We demonstrate the efficacy of our technique using the Geoquery corpus, on which it approaches state of the art performance using 10-fold cross validation, shows little decrease in performance with 2-folds, and achieves above 50% accuracy with as few as 10 examples.

关键词

Artificial Intelligence; Question-Answering

全文

PDF


Style Transfer in Text: Exploration and Evaluation

摘要

The ability to transfer styles of texts or images, is an important measurement of the advancement of artificial intelligence (AI). However, the progress in language style transfer is lagged behind other domains, such as computer vision, mainly because of the lack of parallel data and reliable evaluation metrics. In response to the challenge of lacking parallel data, we explore learning style transfer from non-parallel data. We propose two models to achieve this goal. The key idea behind the proposed models is to learn separate content representations and style representations using adversarial networks. Considering the problem of lacking principle evaluation metrics, we propose two novel evaluation metrics that measure two aspects of style transfer: transfer strength and content preservation. We benchmark our models and the evaluation metrics on two style transfer tasks: paper-news title transfer, and positive-negative review transfer. Results show that the proposed content preservation metric is highly correlate to human judgments, and the proposed models are able to generate sentences with similar content preservation score but higher style transfer strength comparing to auto-encoder.

关键词

全文

PDF


HAN: Hierarchical Association Network for Computing Semantic Relatedness

摘要

Measuring semantic relatedness between two words is a significant problem in many areas such as natural language processing. Existing approaches to the semantic relatedness problem mainly adopt the co-occurrence principle and regard two words as highly related if they appear in the same sentence frequently. However, such solutions suffer from low coverage and low precision because i) the two highly related words may not appear close to each other in the sentences, e.g., the synonyms; and ii) the co-occurrence of words may happen by chance rather than implying the closeness in their semantics. In this paper, we explore the latent semantics (i.e., concepts) of the words to identify highly related word pairs. We propose a hierarchical association network to specify the complex relationships among the words and the concepts, and quantify each relationship with appropriate measurements. Extensive experiments are conducted on real datasets and the results show that our proposed method improves correlation precision compared with the state-of-the-art approaches.

关键词

Semantic relatedness; Hierarchical network; Concept relatedness

全文

PDF


Maximizing Activity in Ising Networks via the TAP Approximation

摘要

A wide array of complex biological, social, and physical systems have recently been shown to be quantitatively described by Ising models, which lie at the intersection of statistical physics and machine learning. Here, we study the fundamental question of how to optimize the state of a networked Ising system given a budget of external influence. In the continuous setting where one can tune the influence applied to each node, we propose a series of approximate gradient ascent algorithms based on the Plefka expansion, which generalizes the naive mean field and TAP approximations. In the discrete setting where one chooses a small set of influential nodes, the problem is equivalent to the famous influence maximization problem in social networks with an additional stochastic noise term. In this case, we provide sufficient conditions for when the objective is submodular, allowing a greedy algorithm to achieve an approximation ratio of 1-1/e. Additionally, we compare the Ising-based algorithms with traditional influence maximization algorithms, demonstrating the practical importance of accurately modeling stochastic fluctuations in the system.

关键词

control; maximum entropy models; influence maximization

全文

PDF


Expected Utility with Relative Loss Reduction: A Unifying Decision Model for Resolving Four Well-Known Paradoxes

摘要

Some well-known paradoxes in decision making (e.g., the Allais paradox, the St. Peterburg paradox, the Ellsberg paradox, and the Machina paradox) reveal that choices conventional expected utility theory predicts could be inconsistent with empirical observations. So, solutions to these paradoxes can help us better understand humans decision making accurately. This is also highly related to the prediction power of a decision-making model in real-world applications. Thus, various models have been proposed to address these paradoxes. However, most of them can only solve parts of the paradoxes, and for doing so some of them have to rely on the parameter tuning without proper justifications for such bounds of parameters. To this end, this paper proposes a new descriptive decision-making model, expected utility with relative loss reduction, which can exhibit the same qualitative behaviours as those observed in experiments of these paradoxes without any additional parameter setting. In particular, we introduce the concept of relative loss reduction to reflect people's tendency to prefer ensuring a sufficient minimum loss to just a maximum expected utility in decision-making under risk or ambiguity.

关键词

decision making; expected utility with relative loss reduction; Allais paradox; Ellsberg paradox; St. Petersburg paradox; Machina paradox

全文

PDF


Towards Building Large Scale Multimodal Domain-Aware Conversation Systems

摘要

While multimodal conversation agents are gaining importance in several domains such as retail, travel etc., deep learning research in this area has been limited primarily due to the lack of availability of large-scale, open chatlogs. To overcome this bottleneck, in this paper we introduce the task of multimodal, domain-aware conversations, and propose the MMD benchmark dataset. This dataset was gathered by working in close coordination with large number of domain experts in the retail domain. These experts suggested various conversations flows and dialog states which are typically seen in multimodal conversations in the fashion domain. Keeping these flows and states in mind, we created a dataset consisting of over 150K conversation sessions between shoppers and sales agents, with the help of in-house annotators using a semi-automated manually intense iterative process. With this dataset, we propose 5 new sub-tasks for multimodal conversations along with their evaluation methodology. We also propose two multimodal neural models in the encode-attend-decode paradigm and demonstrate their performance on two of the sub-tasks, namely text response generation and best image response selection. These experiments serve to establish baseline performance and open new research directions for each of these sub-tasks. Further, for each of the sub-tasks, we present a 'per-state evaluation' of 9 most significant dialog states, which would enable more focused research into understanding the challenges and complexities involved in each of these states.

关键词

Multimodal Conversation/QA; Domain-Aware Conversation Systems

全文

PDF


Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph

摘要

While conversing with chatbots, humans typically tend to ask many questions, a significant portion of which can be answered by referring to large-scale knowledge graphs (KG). While Question Answering (QA) and dialog systems have been studied independently, there is a need to study them closely to evaluate such real-world scenarios faced by bots involving both these tasks. Towards this end, we introduce the task of Complex Sequential QA which combines the two tasks of (i) answering factual questions through complex inferencing over a realistic-sized KG of millions of entities, and (ii) learning to converse through a series of coherently linked QA pairs. Through a labor intensive semi-automatic process, involving in-house and crowdsourced workers, we created a dataset containing around 200K dialogs with a total of 1.6M turns. Further, unlike existing large scale QA datasets which contain simple questions that can be answered from a single tuple, the questions in our dialogs require a larger subgraph of the KG. Specifically, our dataset has questions which require logical, quantitative, and comparative reasoning as well as their combinations. This calls for models which can: (i) parse complex natural language questions, (ii) use conversation context to resolve coreferences and ellipsis in utterances, (iii) ask for clarifications for ambiguous queries, and finally (iv) retrieve relevant subgraphs of the KG to answer such questions. However, our experiments with a combination of state of the art dialog and QA models show that they clearly do not achieve the above objectives and are inadequate for dealing with such complex real world settings. We believe that this new dataset coupled with the limitations of existing models as reported in this paper should encourage further research in Complex Sequential QA.

关键词

Complex QA; Sequential QA; KB based QA

全文

PDF


The Structural Affinity Method for Solving the Raven's Progressive Matrices Test for Intelligence

摘要

Graphical models offer techniques for capturing the structure of many problems in real-world domains and provide means for representation, interpretation, and inference. The modeling framework provides tools for discovering rules for solving problems by exploring structural relationships. We present the Structural Affinity method that uses graphical models for first learning and subsequently recognizing the pattern for solving problems on the Raven's Progressive Matrices Test of general human intelligence. Recently there has been considerable work on computational models of addressing the Raven's test using various representations ranging from fractals to symbolic structures. In contrast, our method uses Markov Random Fields parameterized by affinity factors to discover the structure in the geometric analogy problems and induce the rules of Carpenter et al.'s cognitive model of problem-solving on the Raven's Progressive Matrices Test. We provide a computational account that first learns the structure of a Raven's problem and then predicts the solution by computing the probability of the correct answer by recognizing patterns corresponding to Carpenter et al.'s rules. We demonstrate that the performance of our model on the Standard Raven Progressive Matrices is comparable with existing state of the art models.

关键词

Raven's Progressive Matrices; Markov Random Fields; Heuristic Reasoning; Visual Reasoning

全文

PDF


RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems

摘要

Open-domain human-computer conversation has been attracting increasing attention over the past few years. However, there does not exist a standard automatic evaluation metric for open-domain dialog systems; researchers usually resort to human annotation for model evaluation, which is time- and labor-intensive. In this paper, we propose RUBER, a Referenced metric and Unreferenced metric Blended Evaluation Routine, which evaluates a reply by taking into consideration both a groundtruth reply and a query (previous user-issued utterance). Our metric is learnable, but its training does not require labels of human satisfaction. Hence, RUBER is flexible and extensible to different datasets and languages. Experiments on both retrieval and generative dialog systems show that RUBER has a high correlation with human annotation, and that RUBER has fair transferability over different datasets.

关键词

Natural Language Understanding; Dialog System

全文

PDF


Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

摘要

Perception and expression of emotion are key factors to the success of dialogue systems or conversational agents. However, this problem has not been studied in large-scale conversation generation so far. In this paper, we propose Emotional Chatting Machine (ECM) that can generate appropriate responses not only in content (relevant and grammatical) but also in emotion (emotionally consistent). To the best of our knowledge, this is the first work that addresses the emotion factor in large-scale conversation generation. ECM addresses the factor using three new mechanisms that respectively (1) models the high-level abstraction of emotion expressions by embedding emotion categories, (2) captures the change of implicit internal emotion states, and (3) uses explicit emotion expressions with an external emotion vocabulary. Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion.

关键词

Dialogue; Emotion

全文

PDF


Transferring Decomposed Tensors for Scalable Energy Breakdown Across Regions

摘要

Homes constitute roughly one-third of the total energy usage worldwide. Providing an energy breakdown – energy consumption per appliance, can help save up to 15% energy. Given the vast differences in energy consumption patterns across different regions, existing energy breakdown solutions require instrumentation and model training for each geographical region, which is prohibitively expensive and limits the scalability. In this paper, we propose a novel region independent energy breakdown model via statistical transfer learning. Our key intuition is that the heterogeneity in homes and weather across different regions most significantly impacts the energy consumption across regions; and if we can factor out such heterogeneity, we can learn region independent models or the homogeneous energy breakdown components for each individual appliance. Thus, the model learnt in one region can be transferred to another region. We evaluate our approach on two U.S. cities having distinct weather from a publicly available dataset. We find that our approach gives better energy breakdown estimates requiring the least amount of instrumented homes from the target region, when compared to the state-of-the-art.

关键词

全文

PDF


Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Networks

摘要

Cascades represent rapid changes in networks. A cascading phenomenon of ecological and economic impact is the spread of invasive species in geographic landscapes. The most promising management strategy is often biocontrol, which entails introducing a natural predator able to control the invading population, a setting that can be treated as two interacting cascades of predator and prey populations. We formulate and study a nonlinear problem of optimal biocontrol: optimally seeding the predator cascade over time to minimize the harmful prey population. Recurring budgets, which typically face conservation organizations, naturally leads to sparse constraints which make the problem amenable to approximation algorithms. Available methods based on continuous relaxations scale poorly, to remedy this we develop a novel and scalable randomized algorithm based on a width relaxation, applicable to a broad class of combinatorial optimization problems. We evaluate our contributions in the context of biocontrol for the insect pest Hemlock Wolly Adelgid (HWA) in eastern North America. Our algorithm outperforms competing methods in terms of scalability and solution quality and finds near-optimal strategies for the control of the HWA for fine-grained networks -- an important problem in computational sustainability.

关键词

全文

PDF


DyETC: Dynamic Electronic Toll Collection for Traffic Congestion Alleviation

摘要

To alleviate traffic congestion in urban areas, electronic toll collection (ETC) systems are deployed all over the world. Despite the merits, tolls are usually pre-determined and fixed from day to day, which fail to consider traffic dynamics and thus have limited regulation effect when traffic conditions are abnormal. In this paper, we propose a novel dynamic ETC (DyETC) scheme which adjusts tolls to traffic conditions in realtime. The DyETC problem is formulated as a Markov decision process (MDP), the solution of which is very challenging due to its 1) multi-dimensional state space, 2) multi-dimensional, continuous and bounded action space, and 3) time-dependent state and action values. Due to the complexity of the formulated MDP, existing methods cannot be applied to our problem. Therefore, we develop a novel algorithm, PG-beta, which makes three improvements to traditional policy gradient method by proposing 1) time-dependent value and policy functions, 2) Beta distribution policy function and 3) state abstraction. Experimental results show that, compared with existing ETC schemes, DyETC increases traffic volume by around 8%, and reduces travel time by around 14:6% during rush hour. Considering the total traffic volume in a traffic network, this contributes to a substantial increase to social welfare.

关键词

Dynamic road pricing; Sequential planning; Policy gradient

全文

PDF


Cellular Network Traffic Scheduling With Deep Reinforcement Learning

摘要

Modern mobile networks are facing unprecedented growth in demand due to a new class of traffic from Internet of Things (IoT) devices such as smart wearables and autonomous cars. Future networks must schedule delay-tolerant software updates, data backup, and other transfers from IoT devices while maintaining strict service guarantees for conventional real-time applications such as voice-calling and video. This problem is extremely challenging because conventional traffic is highly dynamic across space and time, so its performance is significantly impacted if all IoT traffic is scheduled immediately when it originates. In this paper, we present a reinforcement learning (RL) based scheduler that can dynamically adapt to traffic variation, and to various reward functions set by network operators, to optimally schedule IoT traffic. Using 4 weeks of real network data from downtown Melbourne, Australia spanning diverse traffic patterns, we demonstrate that our RL scheduler can enable mobile networks to carry 14.7% more data with minimal impact on existing traffic, and outpeforms heuristic schedulers by more than 2x. Our work is a valuable step towards designing autonomous, "self-driving" networks that learn to manage themselves from past data.

关键词

Reinforcement Learning, Time-series/Data Streams

全文

PDF


Dispatch Guided Allocation Optimization for Effective Emergency Response

摘要

Plant-pollinator interaction networks are bipartite networks representing the mutualistic interactions between a set of plant species and a set of pollinator species. Data on these networks are collected by field biologists, who count visits from pollinators to flowers. Ecologists study the structure and function of these networks for scientific, conservation, and agricultural purposes. However, little research has been done to understand the underlying mechanisms that determine pairwise interactions or to predict new links from networks describing the species community. This paper explores the use of latent factor models to predict interactions that will occur in new contexts (e.g. a different distribution of the set of plant species) based on an observed network. The analysis draws on algorithms and evaluation strategies developed for recommendation systems and introduces them to this new domain. The matrix factorization methods compare favorably against several baselines on a pollination dataset collected in montane meadows over several years. Incorporating both positive and negative implicit feedback into the matrix factorization methods is particularly promising.

关键词

Emergency response; Constraint optimisation; Heuristics; Data-driven modelling

全文

PDF


DeepUrbanMomentum: An Online Deep-Learning System for Short-Term Urban Mobility Prediction

摘要

Big human mobility data are being continuously generated through a variety of sources, some of which can be treated and used as streaming data for understanding and predicting urban dynamics. With such streaming mobility data, the online prediction of short-term human mobility at the city level can be of great significance for transportation scheduling, urban regulation, and emergency management. In particular, when big rare events or disasters happen, such as large earthquakes or severe traffic accidents, people change their behaviors from their routine activities. This means people's movements will almost be uncorrelated with their past movements. Therefore, in this study, we build an online system called DeepUrbanMomentum to conduct the next short-term mobility predictions by using (the limited steps of) currently observed human mobility data. A deep-learning architecture built with recurrent neural networks is designed to effectively model these highly complex sequential data for a huge urban area. Experimental results demonstrate the superior performance of our proposed model as compared to the existing approaches. Lastly, we apply our system to a real emergency scenario and demonstrate that our system is applicable in the real world.

关键词

Information systems applications; Human mobility

全文

PDF


Variational BOLT: Approximate Learning in Factorial Hidden Markov Models With Application to Energy Disaggregation

摘要

The learning problem for Factorial Hidden Markov Models with discrete and multi-variate latent variables remains a challenge. Inference of the latent variables required for the E-step of Expectation Minimization algorithms is usually computationally intractable. In this paper we propose a variational learning algorithm mimicking the Baum-Welch algorithm. By approximating the filtering distribution with a variational distribution parameterized by a recurrent neural network, the computational complexity of the learning problem as a function of the number of hidden states can be reduced to quasilinear instead of quadratic time as required by traditional algorithms such as Baum-Welch whilst making minimal independence assumptions. We evaluate the performance of the resulting algorithm, which we call Variational BOLT, in the context of unsupervised end-to-end energy disaggregation. We conduct experiments on the publicly available REDD dataset and show competitive results when compared with a supervised inference approach and state-of-the-art results in an unsupervised setting.

关键词

energy disaggregation; factorial hidden markov models; variational inference; neural networks

全文

PDF


Group Sparse Bayesian Learning for Active Surveillance on Epidemic Dynamics

摘要

Predicting epidemic dynamics is of great value in understanding and controlling diffusion processes, such as infectious disease spread and information propagation. This task is intractable, especially when surveillance resources are very limited. To address the challenge, we study the problem of active surveillance, i.e., how to identify a small portion of system components as sentinels to effect monitoring, such that the epidemic dynamics of an entire system can be readily predicted from the partial data collected by such sentinels. We propose a novel measure, the gamma value, to identify the sentinels by modeling a sentinel network with row sparsity structure. We design a flexible group sparse Bayesian learning algorithm to mine the sentinel network suitable for handling both linear and non-linear dynamical systems by using the expectation maximization method and variational approximation. The efficacy of the proposed algorithm is theoretically analyzed and empirically validated using both synthetic and real-world data.

关键词

Surveillance; Epidemic dynamics; Group sparse; Bayesian Learning; Diffusion

全文

PDF


Predicting Links in Plant-Pollinator Interaction Networks Using Latent Factor Models With Implicit Feedback

摘要

Plant-pollinator interaction networks are bipartite networks representing the mutualistic interactions between a set of plant species and a set of pollinator species. Data on these networks are collected by field biologists, who count visits from pollinators to flowers. Ecologists study the structure and function of these networks for scientific, conservation, and agricultural purposes. However, little research has been done to understand the underlying mechanisms that determine pairwise interactions or to predict new links from networks describing the species community. This paper explores the use of latent factor models to predict interactions that will occur in new contexts (e.g. a different distribution of the set of plant species) based on an observed network. The analysis draws on algorithms and evaluation strategies developed for recommendation systems and introduces them to this new domain. The matrix factorization methods compare favorably against several baselines on a pollination dataset collected in montane meadows over several years. Incorporating both positive and negative implicit feedback into the matrix factorization methods is particularly promising.

关键词

Plant-Pollinator Interaction Networks; Link Prediction; Latent Factor Models; Matrix Factorization; Implicit Feedback

全文

PDF


Computation Error Analysis of Block Floating Point Arithmetic Oriented Convolution Neural Network Accelerator Design

摘要

The heavy burdens of computation and off-chip traffic impede deploying the large scale convolution neural network on embedded platforms. As CNN is attributed to the strong endurance to computation errors, employing block floating point (BFP) arithmetics in CNN accelerators could save the hardware cost and data traffics efficiently, while maintaining the classification accuracy. In this paper, we verify the effects of word width definitions in BFP to the CNN performance without retraining. Several typical CNN models, including VGG16, ResNet-18, ResNet-50 and GoogLeNet, were tested in this paper. Experiments revealed that 8-bit mantissa, including sign bit, in BFP representation merely induced less than 0.3% accuracy loss. In addition, we investigate the computational errors in theory and develop the noise-to-signal ratio (NSR) upper bound, which provides the promising guidance for BFP based CNN engine design.

关键词

Error Analysis; CNN; CNN Accelerator; FPGA

全文

PDF


Multi-Entity Dependence Learning With Rich Context via Conditional Variational Auto-Encoder

摘要

Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities. The availability of rich contextual information requires a nimble learning scheme that tightly integrates with deep neural networks and has the ability to capture correlation structures among exponentially many outcomes. We propose MEDL_CVAE, which encodes a conditional multivariate distribution as a generating process. As a result, the variational lower bound of the joint likelihood can be optimized via a conditional variational auto-encoder and trained end-to-end on GPUs. Our MEDL_CVAE was motivated by two real-world applications in computational sustainability: one studies the spatial correlation among multiple bird species using the eBird data and the other models multi-dimensional landscape composition and human footprint in the Amazon rainforest with satellite images. We show that MEDL_CVAE captures rich dependency structures, scales better than previous methods, and further improves on the joint likelihood taking advantage of very large datasets that are beyond the capacity of previous methods.

关键词

全文

PDF


Optimal Spot-Checking for Improving Evaluation Accuracy of Peer Grading Systems

摘要

Peer grading, allowing students/peers to evaluate others' assignments, offers a promising solution for scaling evaluation and learning to large-scale educational systems. A key challenge in peer grading is motivating peers to grade diligently. While existing spot-checking (SC) mechanisms can prevent peer collusion where peers coordinate to report the uninformative grade, they unrealistically assume that peers have the same grading reliability and cost. This paper studies the general Optimal Spot-Checking (OptSC) problem of determining the probability each assignment needs to be checked to maximize assignments' evaluation accuracy aggregated from peers, and takes into consideration 1) peers' heterogeneous characteristics, and 2) peers' strategic grading behaviors to maximize their own utility. We prove that the bilevel OptSC is NP-hard to solve. By exploiting peers' grading behaviors, we first formulate a single level relaxation to approximate OptSC. By further exploiting structural properties of the relaxed problem, we propose an efficient algorithm to that relaxation, which also gives a good approximation of the original OptSC. Extensive experiments on both synthetic and real datasets show significant advantages of the proposed algorithm over existing approaches.

关键词

Aritificial Intelligence;Multiagent Systems; Game-Theory; Optimization

全文

PDF


Preventing Infectious Disease in Dynamic Populations Under Uncertainty

摘要

Treatable infectious diseases are a critical challenge for public health. Outreach campaigns can encourage undiagnosed patients to seek treatment but must be carefully targeted to make the most efficient use of limited resources. We present an algorithm to optimally allocate limited outreach resources among demographic groups in the population. The algorithm uses a novel multiagent model of disease spread which both captures the underlying population dynamics and is amenable to optimization. Our algorithm extends, with provable guarantees, to a stochastic setting where we have only a distribution over parameters such as the contact pattern between agents. We evaluate our algorithm on two instances where this distribution is inferred from real world data: tuberculosis in India and gonorrhea in the United States. Our algorithm produces a policy which is predicted to avert an average of least 8,000 person-years of tuberculosis and 20,000 person-years of gonorrhea annually compared to current policy.

关键词

Infectious disease; submodularity; stochastic optimization

全文

PDF


Efficiently Approximating the Pareto Frontier: Hydropower Dam Placement in the Amazon Basin

摘要

Real-world problems are often not fully characterized by a single optimal solution, as they frequently involve multiple competing objectives; it is therefore important to identify the so-called Pareto frontier, which captures solution trade-offs. We propose a fully polynomial-time approximation scheme based on Dynamic Programming (DP) for computing a polynomially succinct curve that approximates the Pareto frontier to within an arbitrarily small epsilon > 0 on tree-structured networks. Given a set of objectives, our approximation scheme runs in time polynomial in the size of the instance and 1/epsilon. We also propose a Mixed Integer Programming (MIP) scheme to approximate the Pareto frontier. The DP and MIP Pareto frontier approaches have complementary strengths and are surprisingly effective. We provide empirical results showing that our methods outperform other approaches in efficiency and accuracy. Our work is motivated by a problem in computational sustainability concerning the proliferation of hydropower dams throughout the Amazon basin. Our goal is to support decision-makers in evaluating impacted ecosystem services on the full scale of the Amazon basin. Our work is general and can be applied to approximate the Pareto frontier of a variety of multiobjective problems on tree-structured networks.

关键词

Pareto frontier; approximation algorithms; exact algorithms

全文

PDF


Minesweeper with Limited Moves

摘要

We consider the problem of playing Minesweeper with a limited number of moves: Given a partially revealed board, a number of available clicks k, and a target probability p, can we win with probability p. We win if we do not click on a mine, and, after our sequence of at most k clicks (which reveal information about the neighboring squares) can correctly identify the placement of all mines. We make the assumption, that, at all times, all placements of mines consistent with the currently revealed squares are equiprobable. Our main results are that the problem is PSPACE-complete, and it remains PSPACE-complete when p is a constant, in particular when p = 1. When k = 0 (i.e., we are not allowed to click anywhere), the problem is PP-complete in general, but co-NP-complete when p is a constant, and in particular when p = 1.

关键词

Complexity; Minesweeper; Planning; PSPACE

全文

PDF


Event Representations for Automated Story Generation with Deep Neural Nets

摘要

Automated story generation is the problem of automatically selecting a sequence of events, actions, or words that can be told as a story. We seek to develop a system that can generate stories by learning everything it needs to know from textual story corpora. To date, recurrent neural networks that learn language models at character, word, or sentence levels have had little success generating coherent stories. We explore the question of event representations that provide a mid-level of abstraction between words and sentences in order to retain the semantic information of the original data while minimizing event sparsity. We present a technique for preprocessing textual story data into event sequences. We then present a technique for automated story generation whereby we decompose the problem into the generation of successive events (event2event) and the generation of natural language sentences from events (event2sentence). We give empirical results comparing different event representations and their effects on event successor generation and the translation of events to natural language.

关键词

automated story generation; event representations; recurrent neural networks

全文

PDF


Asymmetric Action Abstractions for Multi-Unit Control in Adversarial Real-Time Games

摘要

Action abstractions restrict the number of legal actions available during search in multi-unit real-time adversarial games, thus allowing algorithms to focus their search on a set of promising actions. Optimal strategies derived from un-abstracted spaces are guaranteed to be no worse than optimal strategies derived from action-abstracted spaces. In practice, however, due to real-time constraints and the state space size, one is only able to derive good strategies in un-abstracted spaces in small-scale games. In this paper we introduce search algorithms that use an action abstraction scheme we call asymmetric abstraction. Asymmetric abstractions retain the un-abstracted spaces' theoretical advantage over regularly abstracted spaces while still allowing the search algorithms to derive effective strategies, even in large-scale games. Empirical results on combat scenarios that arise in a real-time strategy game show that our search algorithms are able to substantially outperform state-of-the-art approaches.

关键词

real-time strategy games; real-time adversarial planning; action abstractions

全文

PDF


PVL: A Framework for Navigating the Precision-Variety Trade-Off in Automated Animation of Smiles

摘要

Animating digital characters has an important role in computer assisted experiences, from video games to movies to interactive robotics. A critical challenge in the field is to generate animations which accurately reflect the state of the animated characters, without looking repetitive or unnatural. In this work, we investigate the problem of procedurally generating a diverse variety of facial animations that express a given semantic quality (e.g., very happy). To that end, we introduce a new learning heuristic called Precision Variety Learning (PVL) which actively identifies and exploits the fundamental trade-off between precision (how accurate positive labels are) and variety (how diverse the set of positive labels is). We both identify conditions where important theoretical properties can be guaranteed, and show good empirical performance in variety of conditions. Lastly, we apply our PVL heuristic to our motivating problem of generating smile animations, and perform several user studies to validate the ability of our method to produce a perceptually diverse variety of smiles for different target intensities.

关键词

Artificial Intelligence; Games; Interactive Entertainment; Machine Learning; Heuristic Optimization; Graphics; Animation

全文

PDF


Utilitarians Without Utilities: Maximizing Social Welfare for Graph Problems Using Only Ordinal Preferences

摘要

We consider ordinal approximation algorithms for a broad class of utility maximization problems for multi-agent systems. In these problems, agents have utilities for connecting to each other, and the goal is to compute a maximum-utility solution subject to a set of constraints. We represent these as a class of graph optimization problems, including matching, spanning tree problems, TSP, maximum weight planar subgraph, and many others. We study these problems in the ordinal setting: latent numerical utilities exist, but we only have access to ordinal preference information, i.e., every agent specifies an ordering over the other agents by preference. We prove that for the large class of graph problems we identify, ordinal information is enough to compute solutions which are close to optimal, thus demonstrating there is no need to know the underlying numerical utilities. For example, for problems in this class with bounded degree b a simple ordinal greedy algorithm always produces a (b + 1)-approximation; we also quantify how the quality of ordinal approximation depends on the sparsity of the resulting graphs. In particular, our results imply that ordinal information is enough to obtain a 2-approximation for Maximum Spanning Tree; a 4-approximation for Max Weight Planar Subgraph; a 2-approximation for Max-TSP; and a 2- approximation for various Matching problems.

关键词

Ordinal Preferences; Approximation Algorithms; Matching; Greedy; TSP; Graph Algorithms

全文

PDF


On the Complexity of Extended and Proportional Justified Representation

摘要

We consider the problem of selecting a fixed-size committee based on approval ballots. It is desirable to have a committee in which all voters are fairly represented. Aziz et al. (2015a; 2017) proposed an axiom called extended justified representation (EJR), which aims to capture this intuition; subsequently, Sanchez-Fernandez et al. (2017) proposed a weaker variant of this axiom called proportional justified representation (PJR). It was shown that it is coNP-complete to check whether a given committee provides EJR, and it was conjectured that it is hard to find a committee that provides EJR. In contrast, there are polynomial-time computable voting rules that output committees providing PJR, but the complexity of checking whether a given committee provides PJR was an open problem. In this paper, we answer open questions from prior work by showing that EJR and PJR have the same worst-case complexity: we provide two polynomial-time algorithms that output committees providing EJR, yet we show that it is coNP-complete to decide whether a given committee provides PJR. We complement the latter result by fixed-parameter tractability results.

关键词

全文

PDF


Rank Maximal Equal Contribution: A Probabilistic Social Choice Function

摘要

When aggregating preferences of agents via voting, two desirable goals are to incentivize agents to participate in the voting process and then identify outcomes that are Pareto efficient. We consider participation as formalized by Brandl, Brandt, and Hofbauer (2015) based on the stochastic dominance (SD) relation. We formulate a new rule called RMEC (Rank Maximal Equal Contribution) that is polynomial-time computable, ex post efficient and satisfies the strongest notion of participation. It also satisfies many other desirable fairness properties. The rule suggests a general approach to achieving very strong participation, ex post efficiency and fairness.

关键词

algorithms; game theory; social choice theory; probabilistic social choice; voting mechanisms; random serial dictatorship; ex post efficient; participation; fairness

全文

PDF


Groupwise Maximin Fair Allocation of Indivisible Goods

摘要

We study the problem of allocating indivisible goods among n agents in a fair manner. For this problem, maximin share (MMS) is a well-studied solution concept which provides a fairness threshold. Specifically, maximin share is defined as the minimum utility that an agent can guarantee for herself when asked to partition the set of goods into n bundles such that the remaining (n-1) agents pick their bundles adversarially. An allocation is deemed to be fair if every agent gets a bundle whose valuation is at least her maximin share. Even though maximin shares provide a natural benchmark for fairness, it has its own drawbacks and, in particular, it is not sufficient to rule out unsatisfactory allocations. Motivated by these considerations, in this work we define a stronger notion of fairness, called groupwise maximin share guarantee (GMMS). In GMMS, we require that the maximin share guarantee is achieved not just with respect to the grand bundle, but also among all the subgroups of agents. Hence, this solution concept strengthens MMS and provides an ex-post fairness guarantee. We show that in specific settings, GMMS allocations always exist. We also establish the existence of approximate GMMS allocations under additive valuations, and develop a polynomial-time algorithm to find such allocations. Moreover, we establish a scale of fairness wherein we show that GMMS implies approximate envy freeness. Finally, we empirically demonstrate the existence of GMMS allocations in a large set of randomly generated instances. For the same set of instances, we additionally show that our algorithm achieves an approximation factor better than the established, worst-case bound.

关键词

Fair division; Maximin shares; Envy free allocations

全文

PDF


Truthful and Near-Optimal Mechanisms for Welfare Maximization in Multi-Winner Elections

摘要

Mechanisms for aggregating the preferences of agents in elections need to balance many different considerations, including efficiency, information elicited from agents, and manipulability. We consider the utilitarian social welfare of mechanisms for preference aggregation, measured by the distortion. We show that for a particular input format called threshold approval voting, where each agent is presented with an independently chosen threshold, there is a mechanism with nearly optimal distortion when the number of voters is large. Threshold mechanisms are potentially manipulable, but place a low informational burden on voters. We then consider truthful mechanisms. For the widely-studied class of ordinal mechanisms which elicit the rankings of candidates from each agent, we show that truthfulness essentially imposes no additional loss of welfare. We give truthful mechanisms with distortion O(√m log m) for k-winner elections, and distortion O(√m log m) when candidates have arbitrary costs, in elections with m candidates. These nearly match known lower bounds for ordinal mechanisms that ignore the strategic behavior. We further tighten these lower bounds and show that for truthful mechanisms our first upper bound is tight. Lastly, when agents decide between two candidates, we give tight bounds on the distortion for truthful mechanisms.

关键词

multi-winner elections; participatory budgeting; distortion

全文

PDF


Multiwinner Elections With Diversity Constraints

摘要

We develop a model of multiwinner elections that combines performance-based measures of the quality of the committee (such as, e.g., Borda scores of the committee members) with diversity constraints. Specifically, we assume that the candidates have certain attributes (such as being a male or a female, being junior or senior, etc.) and the goal is to elect a committee that, on the one hand, has as high a score regarding a given performance measure, but that, on the other hand, meets certain requirements (e.g., of the form "at least 30% of the committee members are junior candidates and at least 40% are females"). We analyze the computational complexity of computing winning committees in this model, obtaining polynomial-time algorithms (exact and approximate) and NP-hardness results. We focus on several natural classes of voting rules and diversity constraints.

关键词

multi-winner elections; approximation algorithms; diversity; computational social choice; voting

全文

PDF


A Bayesian Clearing Mechanism for Combinatorial Auctions

摘要

We cast the problem of combinatorial auction design in a Bayesian framework in order to incorporate prior information into the auction process and minimize the number of rounds to convergence. We first develop a generative model of agent valuations and market prices such that clearing prices become maximum a posteriori estimates given observed agent valuations. This generative model then forms the basis of an auction process which alternates between refining estimates of agent valuations and computing candidate clearing prices. We provide an implementation of the auction using assumed density filtering to estimate valuations and expectation maximization to compute prices. An empirical evaluation over a range of valuation domains demonstrates that our Bayesian auction mechanism is highly competitive against the combinatorial clock auction in terms of rounds to convergence, even under the most favorable choices of price increment for this baseline.

关键词

全文

PDF


AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games

摘要

Evaluating agent performance when outcomes are stochastic and agents use randomized strategies can be challenging when there is limited data available. The variance of sampled outcomes may make the simple approach of Monte Carlo sampling inadequate. This is the case for agents playing heads-up no-limit Texas hold'em poker, whereman-machine competitions typically involve multiple days of consistent play by multiple players, but still can (and sometimes did) result in statistically insignificant conclusions. In this paper, we introduce AIVAT, a low variance, provably unbiased value assessment tool that exploits an arbitrary heuristic estimate of state value, as well as the explicit strategy of a subset of the agents. Unlike existing techniques which reduce the variance from chance events, or only consider game ending actions, AIVAT reduces the variance both from choices by nature and by players with a known strategy. The resulting estimator produces results that significantly outperform previous state of the art techniques. It was able to reduce the standard deviation of a Texas hold'em poker man-machine match by 85\% and consequently requires 44 times fewer games to draw the same statistical conclusion. AIVAT enabled the first statistically significant AI victory against professional poker players in no-limit hold'em.Furthermore, the technique was powerful enough to produce statistically significant results versus individual players, not just an aggregate pool of the players. We also used AIVAT to analyze a short series of AI vs human poker tournaments,producing statistical significant results with as few as 28 matches.

关键词

game theory; variance reduction; poker

全文

PDF


Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce

摘要

In large e-commerce websites, sellers have been observed to engage in fraudulent behaviour, faking historical transactions in order to receive favourable treatment from the platforms, specifically through the allocation of additional buyer impressions which results in higher revenue for them, but not for the system as a whole. This emergent phenomenon has attracted considerable attention, with previous approaches focusing on trying to detect illicit practices and to punish the miscreants. In this paper, we employ the principles of reinforcement mechanism design, a framework that combines the fundamental goals of classical mechanism design, i.e. the consideration of agents' incentives and their alignment with the objectives of the designer, with deep reinforcement learning for optimizing the performance based on these incentives. In particular, first we set up a deep-learning framework for predicting the sellers' rationality, based on real data from any allocation algorithm. We use data from one of largest e-commerce platforms worldwide and train a neural network model to predict the extent to which the sellers will engage in fraudulent behaviour. Using this rationality model, we employ an algorithm based on deep reinforcement learning to optimize the objectives and compare its performance against several natural heuristics, including the platform's implementation and incentive-based mechanisms from the related literature.

关键词

reinforcement mechanism design; reinforcement learning; e-commerce

全文

PDF


Computational Results for Extensive-Form Adversarial Team Games

摘要

We provide, to the best of our knowledge, the first computational study of extensive-form adversarial team games. These games are sequential, zero-sum games in which a team of players, sharing the same utility function, faces an adversary. We define three different scenarios according to the communication capabilities of the team. In the first, the teammates can communicate and correlate their actions both before and during the play. In the second, they can only communicate before the play. In the third, no communication is possible at all. We define the most suitable solution concepts, and we study the inefficiency caused by partial or null communication, showing that the inefficiency can be arbitrarily large in the size of the game tree. Furthermore, we study the computational complexity of the equilibrium-finding problem in the three scenarios mentioned above, and we provide, for each of the three scenarios, an exact algorithm. Finally, we empirically evaluate the scalability of the algorithms in random games and the inefficiency caused by partial or null communication.

关键词

Equilibrium computation; extensive-form games; team games

全文

PDF


On the Distortion of Voting With Multiple Representative Candidates

摘要

We study positional voting rules when candidates and voters are embedded in a common metric space, and cardinal preferences are naturally given by distances in the metric space. In a positional voting rule, each candidate receives a score from each ballot based on the ballot's rank order; the candidate with the highest total score wins the election. The cost of a candidate is his sum of distances to all voters, and the distortion of an election is the ratio between the cost of the elected candidate and the cost of the optimum candidate. We consider the case when candidates are representative of the population, in the sense that they are drawn i.i.d. from the population of the voters, and analyze the expected distortion of positional voting rules. Our main result is a clean and tight characterization of positional voting rules that have constant expected distortion (independent of the number of candidates and the metric space). Our characterization result immediately implies constant expected distortion for Borda Count and elections in which each voter approves a constant fraction of all candidates. On the other hand, we obtain super-constant expected distortion for Plurality, Veto, and approving a constant number of candidates.These results contrast with previous results on voting with metric preferences: When the candidates are chosen adversarially, all of the preceding voting rules have distortion linear in the number of candidates or voters. Thus, the model of representative candidates allows us to distinguish voting rules which seem equally bad in the worst case.

关键词

social choice, voting, metric preferences

全文

PDF


Disarmament Games With Resource

摘要

A paper by Deng and Conitzer in AAAI'17 introduces disarmament games, in which players alternatingly commit not to play certain pure strategies. However, in practice, disarmament usually does not consist in removing a strategy, but rather in removing a resource (and doing so rules out all the strategies in which that resource is used simultaneously). In this paper, we introduce a model of disarmament games in which resources, rather than strategies, are removed. We prove NP-completeness of several formulations of the problem of achieving desirable outcomes via disarmament. We then study the case where resources can be fractionally removed, and prove a result analogous to the folk theorem that all desirable outcomes can be achieved. We show that we can approximately achieve any desirable outcome in a polynomial number of rounds, though determining whether a given outcome can be obtained in a given number of rounds remains NP-complete.

关键词

全文

PDF


Computing the Strategy to Commit to in Polymatrix Games

摘要

Leadership games provide a powerful paradigm to model many real-world settings. Most literature focuses on games with a single follower who acts optimistically, breaking ties in favour of the leader. Unfortunately, for real-world applications, this is unlikely. In this paper, we look for efficiently solvable games with multiple followers who play either optimistically or pessimistically, i.e., breaking ties in favour or against the leader. We study the computational complexity of finding or approximating an optimistic or pessimistic leader-follower equilibrium in specific classes of succinct games—polymatrix like—which are equivalent to 2-player Bayesian games with uncertainty over the follower, with interdependent or independent types. Furthermore, we provide an exact algorithm to find a pessimistic equilibrium for those game classes. Finally, we show that in general polymatrix games the computation is harder even when players are forced to play pure strategies.

关键词

Equilibrium Computation; Leader-Follower; Polymatrix Games

全文

PDF


Resource Allocation Polytope Games: Uniqueness of Equilibrium, Price of Stability, and Price of Anarchy

摘要

We consider a two-player resource allocation polytope game, in which the strategy of a player is restricted by the strategy of the other player, with common coupled constraints. With respect to such a game, we formally introduce the notions of independent optimal strategy profile, which is the profile when players play optimally in the absence of the other player; and common contiguous set, which is the set of top nodes in the preference orderings of both the players that are exhaustively invested on in the independent optimal strategy profile. We show that for the game to have a unique PSNE, it is a necessary and sufficient condition that the independent optimal strategies of the players do not conflict, and either the common contiguous set consists of at most one node or all the nodes in the common contiguous set are invested on by only one player in the independent optimal strategy profile. We further derive a socially optimal strategy profile, and show that the price of anarchy cannot be bound by a common universal constant. We hence present an efficient algorithm to compute the price of anarchy and the price of stability, given an instance of the game. Under reasonable conditions, we show that the price of stability is 1. We encounter a paradox in this game that higher budgets may lead to worse outcomes.

关键词

Resource allocation; polytope games; Nash equilibrium; common coupled constraints; price of stability; price of anarchy

全文

PDF


Allocation Problems in Ride-Sharing Platforms: Online Matching With Offline Reusable Resources

摘要

Bipartite matching markets pair agents on one side of a market with agents, items, or contracts on the opposing side. Prior work addresses online bipartite matching markets, where agents arrive over time and are dynamically matched to a known set of disposable resources. In this paper, we propose a new model, Online Matching with (offline) Reusable Resources under Known Adversarial Distributions (OM-RR-KAD), in which resources on the offline side are reusable instead of disposable; that is, once matched, resources become available again at some point in the future. We show that our model is tractable by presenting an LP-based adaptive algorithm that achieves an online competitive ratio of 1/2 − ε for any given ε > 0. We also show that no non-adaptive algorithm can achieve a ratio of 1/2 + o(1) based on the same benchmark LP. Through a data-driven analysis on a massive openly-available dataset, we show our model is robust enough to capture the application of taxi dispatching services and ride-sharing systems. We also present heuristics that perform well in practice.

关键词

Online Matching; Randomized Algorithms; Ride-Sharing

全文

PDF


Tool Auctions

摘要

We introduce tool auctions, a novel market mechanism for constructing a cost-efficient assembly line for producing a desired set of products from a given set of goods and tools. Such tools can be used to transform one type of good into a different one. We then study the computational complexity of tool auctions in detail, using methods from both classical and parameterized complexity theory. While solving such auctions is intractable in general, just as for the related frameworks of combinatorial and mixed auctions, we are able to identify several special cases of practical interest where designing efficient algorithms is possible.

关键词

全文

PDF


Effective Heuristics for Committee Scoring Rules

摘要

Committee scoring rules form an important class of multiwinner voting rules. As computing winning committees under such rules is generally intractable, in this paper we investigate efficient heuristics for this task. We design two novel heuristics for computing approximate results of multiwinner elections under arbitrary committee scoring rules; notably, one of these heuristics uses concepts from cooperative game theory. We then provide an experimental evaluation of our heuristics (and two others, known from the literature): we compare the scores of the committees output by our algorithms to the scores of the optimal committees, and also use the two-dimensional Euclidean domain to compare the visual representations of the outputs of our algorithms.

关键词

social choice; voting rules; committees; heuristics; submodular optimization; spatial preferences; cooperative game theory; power indices

全文

PDF


On Social Envy-Freeness in Multi-Unit Markets

摘要

We consider a market setting in which buyers are individuals of a population, whose relationships are represented by an underlying social graph. Given buyers valuations for the items being sold, an outcome consists of a pricing of the objects and an allocation of bundles to the buyers. An outcome is social envy-free if no buyer strictly prefers the bundles of her neighbors in the social graph. We focus on the revenue maximization problem in multi-unit markets, in which there are multiple copies of a same item being sold and each buyer is assigned a set of identical items. We consider the four different cases arising by considering different buyers valuations, i.e., single-minded or general, and by adopting different forms of pricing, that is item- or bundle-pricing. For all the above cases we show the hardness of the revenue maximization problem and give corresponding approximation results. All our approximation bounds are optimal or nearly optimal. Moreover, we provide an optimal allocation algorithm for general valuations with item-pricing, under the assumption of social graphs of bounded treewidth. Finally, we determine optimal bounds on the corresponding price of envy-freeness, that is on the worst case ratio between the maximum revenue that can be achieved without envy-freeness constraints, and the one obtainable in case of social relationships. Some of our results close hardness open questions or improve already known ones in the literature concerning the classical setting without sociality.

关键词

Markets; Envy-free pricing; Social Networks

全文

PDF


Facility Location Games With Fractional Preferences

摘要

In this paper, we propose a fractional preference model for the facility location game with two facilities that serve the similar purpose on a line where each agent has his location information as well as fractional preference to indicate how well they prefer the facilities. The preference for each facility is in the range of [0, L] such that the sum of the preference for all facilities is equal to 1. The utility is measured by subtracting the sum of the cost of both facilities from the total length L where the cost of facilities is defined as the multiplication of the fractional preference and the distance between the agent and the facilities. We first show that the lower bound for the objective of minimizing total cost is at least Ω(n^1/3). Hence, we use the utility function to analyze the agents' satification. Our objective is to place two facilities on [0, L] to maximize the social utility or the minimum utility. For each objective function, we propose deterministic strategy-proof mechanisms. For the objective of maximizing the social utility, we present an optimal deterministic strategy-proof mechanism in the case where agents can only misreport their locations. In the case where agents can only misreport their preferences, we present a 2-approximation deterministic strategy-proof mechanism. Finally, we present a 4-approximation deterministic strategy-proof mechanism and a randomized strategy-proof mechanism with an approximation ratio of 2 where agents can misreport both the preference and location information. Moreover, we also give a lower-bound of 1.06. For the objective of maximizing the minimum utility, we give a lower-bound of 1.5 and present a 2-approximation deterministic strategy-proof mechanism where agents can misreport both the preference and location.

关键词

Algorithmic Mechanism Design; Mechanisms without Money; Facility Location

全文

PDF


The Complexity of Bribery in Network-Based Rating Systems

摘要

We study the complexity of bribery in a network-based rating system, where individuals are connected in a social network and an attacker, typically a service provider, can influence their rating and increase the overall profit. We derive a number of algorithmic properties of this framework, in particular we show that establishing the existence of an optimal manipulation strategy for the attacker is NP-complete, even with full knowledge of the underlying network structure.

关键词

Rating-Systems; Bribery; Complexity;

全文

PDF


Weighted Voting Via No-Regret Learning

摘要

Voting systems typically treat all voters equally. We argue that perhaps they should not: Voters who have supported good choices in the past should be given higher weight than voters who have supported bad ones. To develop a formal framework for desirable weighting schemes, we draw on no-regret learning. Specifically, given a voting rule, we wish to design a weighting scheme such that applying the voting rule, with voters weighted by the scheme, leads to choices that are almost as good as those endorsed by the best voter in hindsight. We derive possibility and impossibility results for the existence of such weighting schemes, depending on whether the voting rule and the weighting scheme are deterministic or randomized, as well as on the social choice axioms satisfied by the voting rule.

关键词

全文

PDF


Cooperative Games With Bounded Dependency Degree

摘要

Cooperative games provide a framework to study cooperation among self-interested agents. They offer a number of solution concepts describing how the outcome of the cooperation should be shared among the players. Unfortunately, computational problems associated with many of these solution concepts tend to be intractable---NP-hard or worse. In this paper, we incorporate complexity measures recently proposed by Feige and Izsak (2013), called dependency degree and supermodular degree, into the complexity analysis of coopera- tive games. We show that many computational problems for cooperative games become tractable for games whose dependency degree or supermodular degree are bounded. In particular, we prove that simple games admit efficient algorithms for various solution concepts when the supermodular degree is small; further, we show that computing the Shapley value is always in FPT with respect to the dependency degree. Finally, we observe that, while determining the dependency among players is computationally hard, there are efficient algorithms for special classes of games.

关键词

全文

PDF


Committee Selection with Intraclass and Interclass Synergies

摘要

Voting is almost never done in void, as usually there are some relations between the alternatives on which the voters vote on. These relations shall be taken into consideration when selecting a winning committee of some given multiwinner election. As taking into account all possible relations between the alternatives is generally computationally intractable, in this paper we consider classes of alternatives; intuitively, the number of classes is significantly smaller than the number of alternatives, and thus there is some hope in reaching computational tractability. We model both intraclass relations and interclass relations by functions, which we refer to as synergy functions, and study the computational complexity of identifying the best committee, taking into account those synergy functions. Our model accommodates both positive and negative relations between alternatives; further, our efficient algorithms can also deal with a rich class of diversity wishes, which we show how to model using synergy functions.

关键词

multiwinner elections; diversity

全文

PDF


On Recognising Nearly Single-Crossing Preferences

摘要

If voters' preferences are one-dimensional, many hard problems in computational social choice become tractable. A preference profile can be classified as one-dimensional if it has the single-crossing property, which requires that the voters can be ordered from left to right so that their preferences are consistent with this order. In practice, preferences may exhibit some one-dimensional structure, despite not being single-crossing in the formal sense. Hence, we ask whether one can identify preference profiles that are close to being single-crossing. We consider three distance measures, which are based on partitioning voters or candidates or performing a small number of swaps in each vote. We prove that it can be efficiently decided if voters can be split into two single-crossing groups. Also, for every fixed k >= 1 we can decide in polynomial time if a profile can be made single-crossing by performing at most k candidate swaps per vote. In contrast, for each k >= 3 it is NP-complete to decide whether candidates can be partitioned into k sets so that the restriction of the input profile to each set is single-crossing.

关键词

social choice; structured preferences; preferences; single-crossing; single-peaked; recognition algorithm

全文

PDF


Ranking Wily People Who Rank Each Other

摘要

We study rank aggregation algorithms that take as input the opinions of players over their peers, represented as rankings, and output a social ordering of the players (which reflects, e.g., relative contribution to a project or fit for a job). To prevent strategic behavior, these algorithms must be impartial, i.e., players should not be able to influence their own position in the output ranking. We design several randomized algorithms that are impartial and closely emulate given (non-impartial) rank aggregation rules in a rigorous sense. Experimental results further support the efficacy and practicability of our algorithms.

关键词

peer ranking; rank aggregation; impartiality

全文

PDF


Liquid Democracy: An Algorithmic Perspective

摘要

We study liquid democracy, a collective decision making paradigm that allows voters to transitively delegate their votes, through an algorithmic lens. In our model, there are two alternatives, one correct and one incorrect, and we are interested in the probability that the majority opinion is correct. Our main question is whether there exist delegation mechanisms that are guaranteed to outperform direct voting, in the sense of being always at least as likely, and sometimes more likely, to make a correct decision. Even though we assume that voters can only delegate their votes to better-informed voters, we show that local delegation mechanisms, which only take the local neighborhood of each voter as input (and, arguably, capture the spirit of liquid democracy), cannot provide the foregoing guarantee. By contrast, we design a non-local delegation mechanism that does provably outperform direct voting under mild assumptions about voters.

关键词

liquid democracy, voting

全文

PDF


Policy Learning for Continuous Space Security Games Using Neural Networks

摘要

A wealth of algorithms centered around (integer) linear programming have been proposed to compute equilibrium strategies in security games with discrete states and actions. However, in practice many domains possess continuous state and action spaces. In this paper, we consider a continuous space security game model with infinite-size action sets for players and present a novel deep learning based approach to extend the existing toolkit for solving security games. Specifically, we present (i) OptGradFP, a novel and general algorithm that searches for the optimal defender strategy in a parameterized continuous search space, and can also be used to learn policies over multiple game states simultaneously; (ii) OptGradFP-NN, a convolutional neural network based implementation of OptGradFP for continuous space security games. We demonstrate the potential to predict good defender strategies via experiments and analysis of OptGradFP and OptGradFP-NN on discrete and continuous game settings.

关键词

Stackelberg Security Games; Game Theory; Nash Equilibrium; Stackelberg Equilibrium; Defender Policy Optimization; Policy Gradient; Fictitious Play

全文

PDF


Approximately Stable Matchings With Budget Constraints

摘要

This paper examines two-sided matching with budget constraints where one side (a firm or hospital) can make monetary transfers (offer wages) to the other (a worker or doctor). In a standard model, while multiple doctors can be matched to a single hospital, a hospital has a maximum quota; thus, the number of doctors assigned to a hospital cannot exceed a certain limit. In our model, in contrast, a hospital has a fixed budget; that is, the total amount of wages allocated by each hospital to doctors is constrained. With budget constraints, stable matchings may fail to exist and checking for the existence is hard. To deal with the nonexistence of stable matchings, we extend the "matching with contracts" model of Hatfield and Milgrom so that it deals with approximately stable matchings where each of the hospitals' utilities after deviation can increase by a factor up to a certain amount. We then propose two novel mechanisms that efficiently return a stable matching that exactly satisfies the budget constraints. Specifically, by sacrificing strategy-proofness, our first mechanism achieves the best possible bound. We also explore a special case on which a simple mechanism is strategy-proof for doctors, while maintaining the best possible bound of the general case.

关键词

Game theory; Mechanism design; Two-sided matching; Budget constraints

全文

PDF


Approximating Bribery in Scoring Rules

摘要

The classic bribery problem is to find a minimal subset of voters who need to change their vote to make some preferred candidate win.We find an approximate solution for this problem for a broad family of scoring rules (which includes Borda and t-approval), in the following sense: if there is a strategy which requires bribing k voters, we efficiently find a strategy which requires bribing at most k + Õ(√k) voters. Our algorithm is based on a randomized reduction from bribery to coalitional manipulation (UCM). To solve the UCM problem, we apply the Birkhoff-von Neumann (BvN) decomposition to a fractional manipulation matrix. This allows us to limit the size of the possible ballot search space reducing it from exponential to polynomial, while still obtaining good approximation guarantees. Finding the optimal solution in the truncated search space yields a new algorithm for UCM, which is of independent interest.

关键词

Voting; Election; Approximation; Coalitional Manipulation

全文

PDF


Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead

摘要

Stackelberg equilibria have become increasingly important as a solution concept in computational game theory, largely inspired by practical problems such as security settings. In practice, however, there is typically uncertainty regarding the model about the opponent. This paper is, to our knowledge, the first to investigate Stackelberg equilibria under uncertainty in extensive-form games, one of the broadest classes of game. We introduce robust Stackelberg equilibria, where the uncertainty is about the opponent’s payoffs, as well as ones where the opponent has limited lookahead and the uncertainty is about the opponent’s node evaluation function. We develop a new mixed-integer program for the deterministic limited-lookahead setting. We then extend the program to the robust setting for Stackelberg equilibrium under unlimited and under limited lookahead by the opponent. We show that for the specific case of interval uncertainty about the opponent’s payoffs (or about the opponent’s node evaluations in the case of limited lookahead), robust Stackelberg equilibria can be computed with a mixed-integer program that is of the same asymptotic size as that for the deterministic setting.

关键词

extensive-form game, security game, Stackelberg equilibrium, Robust, limited lookahead

全文

PDF


The Conference Paper Assignment Problem: Using Order Weighted Averages to Assign Indivisible Goods

摘要

We propose a novel mechanism for solving the assignment problem when we have a two sided matching problem with preferences from one side (the agents/reviewers) over the other side (the objects/papers) and both sides have capacity constraints. The assignment problem is a fundamental in both computer science and economics with application in many areas including task and resource allocation. Drawing inspiration from work in multi-criteria decision making and social choice theory we use order weighted averages (OWAs), a parameterized class of mean aggregators, to propose a novel and flexible class of algorithms for the assignment problem. We show an algorithm for finding an SUM-OWA assignment in polynomial time, in contrast to the NP-hardness of finding an egalitarian assignment. We demonstrate through empirical experiments that using SUM-OWA assignments can lead to high quality and more fair assignments.

关键词

social choice; matching; assignment

全文

PDF


Incentivizing High Quality User Contributions: New Arm Generation in Bandit Learning

摘要

We study the problem of incentivizing high quality contributions in user generated content platforms, in which users arrive sequentially with unknown quality. We are interested in designing a content displaying strategy which decides which content should be chosen to show to users, with the goal of maximizing user experience (i.e., the likelihood of users liking the content).This goal naturally leads to a joint problem of incentivizing high quality contributions and learning the unknown content quality. To address the incentive issue, we consider a model in which users are strategic in deciding whether to contribute and are motivated by exposure, i.e., they aim to maximize the number of times their contributions are viewed. For the learning perspective, we model the content quality as the probability of obtaining positive feedback (e.g., like or upvote) from a random user. Naturally, the platform needs to resolve the classical trade-off between exploration (collecting feedback for all content) and exploitation (displaying the best content). We formulate this problem as a multi-arm bandit problem, where the number of arms (i.e., contributions) is increasing over time and depends on the strategic choices of arriving users. We first show that applying standard bandit algorithms incentivizes a flood of low cost contributions, which in turn leads to linear regret. We then propose Rand_UCB which adds an additional layer of randomization on top of the UCB algorithm to address the issue of flooding contributions. We show that Rand_UCB helps eliminate the incentives for low quality contributions, provides incentives for high quality contributions (due to bounded number of explorations for the low quality ones), and achieves sub-linear regrets with respect to displaying the current best arms.

关键词

incentive;game theory;user generated content; multi-armed bandit

全文

PDF


On the Approximation of Nash Equilibria in Sparse Win-Lose Games

摘要

We show that the problem of finding an approximate Nash equilibrium with a polynomial precision is PPAD-hard even for two-player sparse win-lose games (i.e., games with {0,1}-entries such that each row and column of the two n×n payoff matrices have at most O(log n) many ones). The proof is mainly based on a new class of prototype games called Chasing Games, which we think is of independent interest in understanding the complexity of Nash equilibrium.

关键词

PPAD; Nash Equilibrium; Game Theory

全文

PDF


Balancing Lexicographic Fairness and a Utilitarian Objective With Application to Kidney Exchange

摘要

Balancing fairness and efficiency in resource allocation is a classical economic and computational problem. The price of fairness measures the worst-case loss of economic efficiency when using an inefficient but fair allocation rule; for indivisible goods in many settings, this price is unacceptably high. One such setting is kidney exchange, where needy patients swap willing but incompatible kidney donors. In this work, we close an open problem regarding the theoretical price of fairness in modern kidney exchanges. We then propose a general hybrid fairness rule that balances a strict lexicographic preference ordering over classes of agents, and a utilitarian objective that maximizes economic efficiency. We develop a utility function for this rule that favors disadvantaged groups lexicographically; but if cost to overall efficiency becomes too high, it switches to a utilitarian objective. This rule has only one parameter which is proportional to a bound on the price of fairness, and can be adjusted by policymakers. We apply this rule to real data from a large kidney exchange and show that our hybrid rule produces more reliable outcomes than other fairness rules.

关键词

Game Theory; Multiagent Systems;Fair Division

全文

PDF


Single-Peakedness and Total Unimodularity: New Polynomial-Time Algorithms for Multi-Winner Elections

摘要

The winner determination problems of many attractive multi-winner voting rules are NP-complete. However, they often admit polynomial-time algorithms when restricting inputs to be single-peaked. Commonly, such algorithms employ dynamic programming along the underlying axis. We introduce a new technique: carefully chosen integer linear programming (IP) formulations for certain voting problems admit an LP relaxation which is totally unimodular if preferences are single-peaked, and which thus admits an integral optimal solution. This technique gives efficient algorithms for finding optimal committees under Proportional Approval Voting (PAV) and the Chamberlin-Courant rule with single-peaked preferences, as well as for certain OWA-based rules. For PAV, this is the first technique able to efficiently find an optimal committee when preferences are single-peaked. An advantage of our approach is that no special-purpose algorithm needs to be used to exploit structure in the input preferences: any standard IP solver will terminate in the first iteration if the input is single-peaked, and will continue to work otherwise.

关键词

social choice; voting; total unimodularity; linear programming; single-peaked preferences; structured preferences; approval voting

全文

PDF


Fair Rent Division on a Budget

摘要

The standard approach to fair rent division assumes that agents have quasi-linear utilities, and seeks allocations that are envy free; it underlies an algorithm that is widely used in practice. However, this approach does not take budget constraints into account, and, therefore, may assign agents to rooms they cannot afford. By contrast, we design a polynomial-time algorithm that takes budget constraints as part of its input; it determines whether there exist envy-free allocations that satisfy the budget constraints, and, if so, computes one that optimizes an additional criterion of justice. In particular, this gives a polynomial-time implementation of the budget-constrained maximin solution, where the maximization objective is the minimum utility of any agent. We show that, like its non-budget-constrained counterpart, this solution is unique in terms of utilities (when it exists), and satisfies additional desirable properties.

关键词

全文

PDF


Approximation-Variance Tradeoffs in Facility Location Games

摘要

We revisit the well-studied problem of constructing strategyproof approximation mechanisms for facility location games, but offer a fundamentally new perspective by considering risk averse designers. Specifically, we are interested in the tradeoff between a randomized strategyproof mechanism's approximation ratio, and its variance (which has long served as a proxy for risk). When there is just one facility, we observe that the social cost objective is trivial, and derive the optimal tradeoff with respect to the maximum cost objective. When there are multiple facilities, the main challenge is the social cost objective, and we establish a surprising impossibility result: under mild assumptions, no smooth approximation-variance tradeoff exists. We also discuss the implications of our work for computational mechanism design at large.

关键词

Mechanism Design; Approximation; Variance; Tradeoff; Facility Location Games

全文

PDF


MUDA: A Truthful Multi-Unit Double-Auction Mechanism

摘要

In a seminal paper, McAfee (1992) presented a truthful mechanism for double auctions, attaining asymptotically-optimal gain-from-trade without any prior information on the valuations of the traders. McAfee's mechanism handles single-parametric agents, allowing each seller to sell a single unit and each buyer to buy a single unit. This paper presents a double-auction mechanism that handles multi-parametric agents and allows multiple units per trader, as long as the valuation functions of all traders have decreasing marginal returns. The mechanism is prior-free, ex-post individually-rational, dominant-strategy truthful and strongly-budget-balanced. Its gain-from-trade approaches the optimum when the market size is sufficiently large.

关键词

Mechanism design; Auctions; Bilateral trade; Markets

全文

PDF


Traffic Optimization for a Mixture of Self-Interested and Compliant Agents

摘要

This paper focuses on two commonly used path assignment policies for agents traversing a congested network: self-interested routing, and system-optimum routing. In the self-interested routing policy each agent selects a path that optimizes its own utility, while in the system-optimum routing, agents are assigned paths with the goal of maximizing system performance. This paper considers a scenario where a centralized network manager wishes to optimize utilities over all agents, i.e., implement a system-optimum routing policy. In many real-life scenarios, however, the system manager is unable to influence the route assignment of all agents due to limited influence on route choice decisions. Motivated by such scenarios, a computationally tractable method is presented that computes the minimal amount of agents that the system manager needs to influence (compliant agents) in order to achieve system optimal performance. Moreover, this methodology can also determine whether a given set of compliant agents is sufficient to achieve system optimum and compute the optimal route assignment for the compliant agents to do so. Experimental results are presented showing that in several large-scale, realistic traffic networks optimal flow can be achieved with as low as 13% of the agent being compliant and up to 54%.

关键词

autonomous agent; traffic optimization; network flow; mixed equilibrium; routing games

全文

PDF


Coalition Manipulation of Gale-Shapley Algorithm

摘要

It is well-known that the Gale-Shapley algorithm is not truthful for all agents. Previous studies in this category concentrate on manipulations using incomplete preference lists by a single woman and by the set of all women. Little is known about manipulations by a subset of women. In this paper, we consider manipulations by any subset of women with arbitrary preferences. We show that a strong Nash equilibrium of the induced manipulation game always exists among the manipulators and the equilibrium outcome is unique and Pareto-dominant. In addition, the set of matchings achievable by manipulations has a lattice structure. We also examine the super-strong Nash equilibrium in the end.

关键词

gale-shapley algorithm; stable matching; coalition manipulation

全文

PDF


Axioms for Distance-Based Centralities

摘要

We study the class of distance-based centralities that consists of centrality measures that depend solely on distances to other nodes in the graph. This class encompasses a number of centrality measures, including the classical Degree and Closeness Centralities, as well as their extensions: the Harmonic, Reach and Decay Centralities. We axiomatize the class of distance-based centralities and study what conditions are imposed by the axioms proposed in the literature. Building upon our analysis, we propose the class of additive distance-based centralities and pin-point properties which combined with the axiomatic characterization of the whole class uniquely characterize a number of centralities from the literature.

关键词

Axioms; Centrality; Graphs

全文

PDF


Non-Exploitable Protocols for Repeated Cake Cutting

摘要

We introduce the notion of exploitability in cut-and-choose protocols for repeated cake cutting. If a cut-and-choose protocol is repeated, the cutter can possibly gain information about the chooser from her previous actions, and exploit this information for her own gain, at the expense of the chooser. We define a generalization of cut-and-choose protocols - forced-cut protocols - in which some cuts are made exogenously while others are made by the cutter, and show that there exist non-exploitable forced-cut protocols that use a small number of cuts per day: When the cake has at least as many dimensions as days, we show a protocol that uses a single cut per day. When the cake is 1-dimensional, we show an adaptive non-exploitable protocol that uses 3 cuts per day, and a non-adaptive protocol that uses n cuts per day (where n is the number of days). In contrast, we show that no non-adaptive non-exploitable forced-cut protocol can use a constant number of cuts per day. Finally, we show that if the cake is at least 2-dimensional, there is a non-adaptive non-exploitable protocol that uses 3 cuts per day.

关键词

cake cutting; fair division; dynamic fair division; repeated cake cutting; exploitability

全文

PDF


Modelling Iterative Judgment Aggregation

摘要

We introduce a formal model of iterative judgment aggregation, enabling the analysis of scenarios in which agents repeatedly update their individual positions on a set of issues, before a final decision is made by applying an aggregation rule to these individual positions. Focusing on two popular aggregation rules, the premise-based rule and the plurality rule, we study under what circumstances convergence to an equilibrium can be guaranteed. We also analyse the quality, in social terms, of the final decisions obtained. Our results not only shed light on the parameters that determine whether iteration converges and is socially beneficial, but they also clarify important differences between iterative judgment aggregation and the related framework of iterative voting.

关键词

全文

PDF


Rich Coalitional Resource Games

摘要

We propose a simple model of interaction for resource-conscious agents. The resources involved are expressed in fragments of Linear Logic. We investigate a few problems relevant to cooperative games, such as deciding whether a group of agents can form a coalition and act together in a way that satisfies all of them. In terms of solution concepts, we study the computational aspects of the core of a game. The main contributions are a formal link with the existing literature, and complexity results for several classes of models.

关键词

cooperative games; core; resources; complexity

全文

PDF


It Takes (Only) Two: Adversarial Generator-Encoder Networks

摘要

We present a new autoencoder-type architecture that is trainable in an unsupervised mode, sustains both generation and inference, and has the quality of conditional and unconditional samples boosted by adversarial learning. Unlike previous hybrids of autoencoders and adversarial networks, the adversarial game in our approach is set up directly between the encoder and the generator, and no external mappings are trained in the process of learning.The game objective compares the divergences of each of the real and the generated data distributions with the prior distribution in the latent space. We show that direct generator-vs-encoder game leads to a tight coupling of the two components, resulting in samples and reconstructions of a comparable quality to some recently-proposed more complex architectures.

关键词

adversarial learning

全文

PDF


An Axiomatization of the Eigenvector and Katz Centralities

摘要

Feedback centralities are one of the key classes of centrality measures. They assess the importance of a vertex recursively, based on the importance of its neighbours. Feedback centralities includes the Eigenvector Centrality, as well as its variants, such as the Katz Centrality or the PageRank, and are used in various AI applications, such as ranking the importance of websites on the Internet and most influential users in the Twitter social network. In this paper, we study the theoretical underpinning of the feedback centralities. Specifically, we propose a novel axiomatization of the Eigenvector Centrality and the Katz Centrality based on six simple requirements. Our approach highlights the similarities and differences between both centralities which may help in choosing the right centrality for a specific application.

关键词

Axioms; Eigenvector Centrality; Katz Centrality

全文

PDF


A Regression Approach for Modeling Games With Many Symmetric Players

摘要

We exploit player symmetry to formulate the representation of large normal-form games as a regression task. This formulation allows arbitrary regression methods to be employed in in estimating utility functions from a small subset of the game's outcomes. We demonstrate the applicability both neural networks and Gaussian process regression, but focus on the latter. Once utility functions are learned, computing Nash equilibria requires estimating expected payoffs of pure-strategy deviations from mixed-strategy profiles. Computing these expectations exactly requires an infeasible sum over the full payoff matrix, so we propose and test several approximation methods. Three of these are simple and generic, applicable to any regression method and games with any number of player roles. However, the best performance is achieved by a continuous integral that approximates the summation, which we formulate for the specific case of fully-symmetric games learned by Gaussian process regression with a radial basis function kernel. We demonstrate experimentally that the combination of learned utility functions and expected payoff estimation allows us to efficiently identify approximate equilibria of large games using sparse payoff data.

关键词

game theory; equilibrium; regression

全文

PDF


Equilibrium Computation and Robust Optimization in Zero Sum Games With Submodular Structure

摘要

We define a class of zero-sum games with combinatorial structure, where the best response problem of one player is to maximize a submodular function. For example, this class includes security games played on networks, as well as the problem of robustly optimizing a submodular function over the worst case from a set of scenarios. The challenge in computing equilibria is that both players' strategy spaces can be exponentially large. Accordingly, previous algorithms have worst-case exponential runtime and indeed fail to scale up on practical instances. We provide a pseudopolynomial-time algorithm which obtains a guaranteed (1 - 1/e)^2-approximate mixed strategy for the maximizing player. Our algorithm only requires access to a weakened version of a best response oracle for the minimizing player which runs in polynomial time. Experimental results for network security games and a robust budget allocation problem confirm that our algorithm delivers near-optimal solutions and scales to much larger instances than was previously possible.

关键词

Submodularity; robust optimization; equilibrium computation

全文

PDF


Incentive-Compatible Forecasting Competitions

摘要

We consider the design of forecasting competitions in which multiple forecasters make predictions about one or more independent events and compete for a single prize. We have two objectives: (1) to award the prize to the most accurate forecaster, and (2) to incentivize forecasters to report truthfully, so that forecasts are informative and forecasters need not spend any cognitive effort strategizing about reports. Proper scoring rules incentivize truthful reporting if all forecasters are paid according to their scores. However, incentives become distorted if only the best-scoring forecaster wins a prize, since forecasters can often increase their probability of having the highest score by reporting extreme beliefs. Even if forecasters do report truthfully, awarding the prize to the forecaster with highest score does not guarantee that high-accuracy forecasters are likely to win; in extreme cases, it can result in a perfect forecaster having zero probability of winning. In this paper, we introduce a truthful forecaster selection mechanism. We lower-bound the probability that our mechanism selects the most accurate forecaster, and give rates for how quickly this bound approaches 1 as the number of events grows. Our techniques can be generalized to the related problems of outputting a ranking over forecasters and hiring a forecaster with high accuracy on future events.

关键词

forecasting; incentives; expert identification; proper scoring rules

全文

PDF


Strategic Coordination of Human Patrollers and Mobile Sensors With Signaling for Security Games

摘要

Traditional security games concern the optimal randomized allocation of human patrollers, who can directly catch attackers or interdict attacks. Motivated by the emerging application of utilizing mobile sensors (e.g., UAVs) for patrolling, in this paper we propose the novel Sensor-Empowered security Game (SEG) model which captures the joint allocation of human patrollers and mobile sensors. Sensors differ from patrollers in that they cannot directly interdict attacks, but they can notify nearby patrollers (if any). Moreover, SEGs incorporate mobile sensors' natural functionality of strategic signaling. On the technical side, we first prove that solving SEGs is NP-hard even in zero-sum cases. We then develop a scalable algorithm SEGer based on the branch-and-price framework with two key novelties: (1) a novel MILP formulation for the slave; (2) an efficient relaxation of the problem for pruning. To further accelerate SEGer, we design a faster combinatorial algorithm for the slave problem, which is provably a constant-approximation to the slave problem in zero-sum cases and serves as a useful heuristic for general-sum SEGs. Our experiments demonstrate the significant benefit of utilizing mobile sensors.

关键词

Security Games; Sensors; Signaling; Deception

全文

PDF


Average-Case Approximation Ratio of Scheduling Without Payments

摘要

Apart from the principles and methodologies inherited from Economics and Game Theory, the studies in Algorithmic Mechanism Design typically employ the worst-case analysis and approximation schemes of Theoretical Computer Science. For instance, the approximation ratio, which is the canonical measure of evaluating how well an incentive-compatible mechanism approximately optimizes the objective, is defined in the worst-case sense. It compares the performance of the optimal mechanism against the performance of a truthful mechanism, for all possible inputs. In this paper, we take the average-case analysis approach, and tackle one of the primary motivating problems in Algorithmic Mechanism Design -- the scheduling problem [Nisan and Ronen 1999]. One version of this problem which includes a verification component is studied by [Koutsoupias 2014]. It was shown that the problem has a tight approximation ratio bound of (n+1)/2 for the single-task setting, where n is the number of machines. We show, however, when the costs of the machines to executing the task follow any independent and identical distribution, the average-case approximation ratio of the mechanism given in [Koutsoupias 2014] is upper bounded by a constant. This positive result asymptotically separates the average-case ratio from the worst-case ratio, and indicates that the optimal mechanism for the problem actually works well on average, although in the worst-case the expected cost of the mechanism is Theta(n) times that of the optimal cost.

关键词

全文

PDF


Avoiding Dead Ends in Real-Time Heuristic Search

摘要

Many systems, such as mobile robots, need to be controlled in real time. Real-time heuristic search is a popular on-line planning paradigm that supports concurrent planning and execution. However,existing methods do not incorporate a notion of safety and we show that they can perform poorly in domains that contain dead-end states from which a goal cannot be reached. We introduce new real-time heuristic search methods that can guarantee safety if the domain obeys certain properties. We test these new methods on two different simulated domains that contain dead ends, one that obeys the properties and one that does not. We find that empirically the new methods provide good performance. We hope this work encourages further efforts to widen the applicability of real-time planning.

关键词

heuristic search, real-time search, safety

全文

PDF


Efficiently Monitoring Small Data Modification Effect for Large-Scale Learning in Changing Environment

摘要

We study large-scale machine learning problems in changing environments where a small part of the dataset is modified, and the effect of the data modification must be monitored in order to know how much the modification changes the optimal model. When the entire dataset is large, even if the amount of the data modification is fairly small, the computational cost for re-training the model would be prohibitively large. In this paper, we propose a novel method, called the optimal solution bounding (OSB), for monitoring such a data modification effect on the optimal model by efficiently evaluating (without actually re-training) it. The proposed method provides bounds on the unknown optimal model with the cost proportional only to the size of the data modification.

关键词

全文

PDF


A Recursive Scenario Decomposition Algorithm for Combinatorial Multistage Stochastic Optimisation Problems

摘要

Stochastic programming is concerned with decision making under uncertainty, seeking an optimal policy with respect to a set of possible future scenarios. This paper looks at multistage decision problems where the uncertainty is revealed over time. First, decisions are made with respect to all possible future scenarios. Secondly, after observing the random variables, a set of scenario specific decisions is taken. Our goal is to develop algorithms that can be used as a back-end solver for high-level modeling languages. In this paper we propose a scenario decomposition method to solve multistage stochastic combinatorial decision problems recursively. Our approach is applicable to general problem structures, utilizes standard solving technology and is highly parallelizable. We provide experimental results to show how it efficiently solves benchmarks with hundreds of scenarios.

关键词

Computer Science; Stochastic Programming; Constraint Programming

全文

PDF


Locality Preserving Projection Based on F-norm

摘要

Locality preserving projection (LPP) is a well-known method for dimensionality reduction in which the neighborhood graph structure of data is preserved. Traditional LPP employ squared F-norm for distance measurement. This may exaggerate more distance errors, and result in a model being sensitive to outliers. In order to deal with this issue, we propose two novel F-norm-based models, termed as F-LPP and F-2DLPP, which are developed for vector-based and matrix-based data, respectively. In F-LPP and F-2DLPP, the distance of data projected to a low dimensional space is measured by F-norm. Thus it is anticipated that both methods can reduce the influence of outliers. To solve the F-norm-based models, we propose an iterative optimization algorithm, and give the convergence analysis of algorithm. The experimental results on three public databases have demonstrated the effectiveness of our proposed methods.

关键词

全文

PDF


A Two-Stage MaxSAT Reasoning Approach for the Maximum Weight Clique Problem

摘要

MaxSAT reasoning is an effective technology used in modern branch-and-bound (BnB) algorithms for the Maximum Weight Clique problem (MWC) to reduce the search space. However, the current MaxSAT reasoning approach for MWC is carried out in a blind manner and is not guided by any relevant strategy. In this paper, we describe a new BnB algorithm for MWC that incorporates a novel two-stage MaxSAT reasoning approach. In each stage, the MaxSAT reasoning is specialised and guided for different tasks. Experiments on an extensive set of graphs show that the new algorithm implementing this approach significantly outperforms relevant exact and heuristic MWC algorithms in both small/medium and massive real-world graphs.

关键词

Maximum Weight Clique Problem; MaxSAT Reasoning; Branch-and-Bound

全文

PDF


Revisiting Immediate Duplicate Detection in External Memory Search

摘要

External memory search algorithms store the open and closed lists in secondary memory (e.g., hard disks) to augment limited internal memory. To minimize expensive random access in hard disks, these algorithms typically employ delayed duplicate detection (DDD), at the expense of processing more nodes than algorithms using immediate duplicate detection (IDD). Given the recent ubiquity of solid state drives (SSDs), we revisit the use of IDD in external memory search. We propose segmented compression, an improved IDD method that significantly reduces the number of false positive access into secondary memory. We show that A*-IDD, an external search variant of A* that uses segmented compression-based IDD, significantly improves upon previous open-addressing based IDD. We also show that A*-IDD can outperform DDD-based A* on some domains in domain-independent planning.

关键词

Heuristic Search; External Memory Search; Immediate Duplicate Detection; Delayed Duplicate Detection; SSD

全文

PDF


Warmstarting of Model-Based Algorithm Configuration

摘要

The performance of many hard combinatorial problem solvers depends strongly on their parameter settings, and since manual parameter tuning is both tedious and suboptimal the AI community has recently developed several algorithm configuration (AC) methods to automatically address this problem. While all existing AC methods start the configuration process of an algorithm A from scratch for each new type of benchmark instances, here we propose to exploit information about A's performance on previous benchmarks in order to warmstart its configuration on new types of benchmarks. We introduce two complementary ways in which we can exploit this information to warmstart AC methods based on a predictive model. Experiments for optimizing a flexible modern SAT solver on twelve different instance sets show that our methods often yield substantial speedups over existing AC methods (up to 165-fold) and can also find substantially better configurations given the same compute budget.

关键词

Algorithm Configuration; Parameter Tuning; Empirical Algorithmics

全文

PDF


On the Time and Space Complexity of Genetic Programming for Evolving Boolean Conjunctions

摘要

Genetic Programming (GP) is a general purpose bio-inspired meta-heuristic for the evolution of computer programs. In contrast to the several successful applications, there is little understanding of the working principles behind GP. In this paper we present a performance analysis that sheds light on the behaviour of simple GP systems for evolving conjunctions of n variables (AND_n). The analysis of a random local search GP system with minimal terminal and function sets reveals the relationship between the number of iterations and the expected error of the evolved program on the complete training set. Afterwards we consider a more realistic GP system equipped with a global mutation operator and prove that it can efficiently solve AND_n by producing programs of linear size that fit a training set to optimality and with high probability generalise well. Additionally, we consider more general problems which extend the terminal set with undesired variables or negated variables. In the presence of undesired variables, we prove that, if non-strict selection is used, then the algorithm fits the complete training set efficiently while the strict selection algorithm may fail with high probability unless the substitution operator is switched off. In the presence of negations, we show that while the algorithms fail to fit the complete training set, the constructed solutions generalise well. Finally, from a problem hardness perspective, we reveal the existence of small training sets that allow the evolution of the exact conjunctions even in the presence of negations or of undesired variables.

关键词

Genetic programming; Runtime analysis; Computational complexity

全文

PDF


Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework

摘要

Deep learning models have gained great success in many real-world applications. However, most existing networks are typically designed in heuristic manners, thus lack of rigorous mathematical principles and derivations. Several recent studies build deep structures by unrolling a particular optimization model that involves task information. Unfortunately, due to the dynamic nature of network parameters, their resultant deep propagation networks do not possess the nice convergence property as the original optimization scheme does. This paper provides a novel proximal unrolling framework to establish deep models by integrating experimentally verified network architectures and rich cues of the tasks. More importantly,we prove in theory that 1) the propagation generated by our unrolled deep model globally converges to a critical-point of a given variational energy, and 2) the proposed framework is still able to learn priors from training data to generate a convergent propagation even when task information is only partially available. Indeed, these theoretical results are the best we can ask for, unless stronger assumptions are enforced. Extensive experiments on various real-world applications verify the theoretical convergence and demonstrate the effectiveness of designed deep models.

关键词

Proximal alternating direction; Deep Network; Convergence; Unrolling framework

全文

PDF


Streaming Non-Monotone Submodular Maximization: Personalized Video Summarization on the Fly

摘要

The need for real time analysis of rapidly producing data streams (e.g., video and image streams) motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data "on the fly." Such problems can often be reduced to maximizing a submodular set function subject to various constraints. While efficient streaming methods have been recently developed for monotone submodular maximization, in a wide range of applications, such as video summarization, the underlying utility function is non-monotone, and there are often various constraints imposed on the optimization problem to consider privacy or personalization. We develop the first efficient single pass streaming algorithm, Streaming Local Search, that for any streaming monotone submodular maximization algorithm with approximation guarantee α under a collection of independence systems I, provides a constant 1/(1+2/√α+1/α+2d(1+√α)) approximation guarantee for maximizing a non-monotone submodular function under the intersection of I and d knapsack constraints. Our experiments show that for video summarization, our method runs more than 1700 times faster than previous work, while maintaining practically the same performance.

关键词

Submodular Maximization; Streaming Algorithms; Data Summarization; Video Summarization

全文

PDF


Exact Clustering via Integer Programming and Maximum Satisfiability

摘要

We consider the following general graph clustering problem: given a complete undirected graph G=(V,E,c) with an edge weight function c:E->Q, we are asked to find a partition C of V that maximizes the sum of edge weights within the clusters in C. Owing to its high generality, this problem has a wide variety of real-world applications, including correlation clustering, group technology, and community detection. In this study, we investigate the design of mathematical programming formulations and constraint satisfaction formulations for the problem. First, we present a novel integer linear programming (ILP) formulation that has far fewer constraints than the standard ILP formulation by Groetschel and Wakabayashi (1989). Second, we propose an ILP-based exact algorithm that solves an ILP problem obtained by modifying our above ILP formulation and then performs simple post-processing to produce an optimal solution to the original problem. Third, we present maximum satisfiability (MaxSAT) counterparts of both our ILP formulation and ILP-based exact algorithm. Computational experiments using well-known real-world datasets demonstrate that our ILP-based approaches and their MaxSAT counterparts are highly effective in terms of both memory efficiency and computation time.

关键词

clustering; exact algorithms; mathematical programming; integer linear programming; constraint satisfaction; maximum satisfiability

全文

PDF


On Multiset Selection With Size Constraints

摘要

This paper considers the multiset selection problem with size constraints, which arises in many real-world applications such as budget allocation. Previous studies required the objective function f to be submodular, while we relax this assumption by introducing the notion of the submodularity ratios (denoted by α_f and β_f). We propose an anytime randomized iterative approach POMS, which maximizes the given objective f and minimizes the multiset size simultaneously. We prove that POMS using a reasonable time achieves an approximation guarantee of max{1-1/e^(β_f), (α_f/2)(1-1/e^(α_f))}. Particularly, when f is submdoular, this bound is at least as good as that of the previous greedy-style algorithms. In addition, we give lower bounds on the submodularity ratio for the objectives of budget allocation. Experimental results on budget allocation as well as a more complex application, namely, generalized influence maximization, exhibit the superior performance of the proposed approach.

关键词

全文

PDF


Disjunctive Program Synthesis: A Robust Approach to Programming by Example

摘要

Programming by example (PBE) systems allow end users to easily create programs by providing a few input-output examples to specify their intended task. The system attempts to generate a program in a domain specific language (DSL) that satisfies the given examples. However, a key challenge faced by existing PBE techniques is to ensure the robustness of the programs that are synthesized from a small number of examples, as these programs often fail when applied to new inputs. This is because there can be many possible programs satisfying a small number of examples, and the PBE system has to somehow rank between these candidates and choose the correct one without any further information from the user. In this work we present a different approach to PBE in which the system avoids making a ranking decision at the synthesis stage, by instead synthesizing a disjunctive program that includes the many top-ranked programs as possible alternatives and selects between these different choices upon execution on a new input. This delayed choice brings the important benefit of comparing the possible outputs produced by the different disjuncts on a given input at execution time. We present a generic framework for synthesizing such disjunctive programs in arbitrary DSLs, and describe two concrete implementations of disjunctive synthesis in the practical domains of data extraction from plain text and HTML documents. We present an evaluation showing the significant increase in robustness achieved with our disjunctive approach, as illustrated by an increase from 59% to 93% of tasks for which correct programs can be learnt from a single example.

关键词

program synthesis; programming by example;

全文

PDF


Accelerated Best-First Search With Upper-Bound Computation for Submodular Function Maximization

摘要

Submodular maximization continues to be an attractive subject of study thanks to its applicability to many real-world problems. Although greedy-based methods are guaranteed to find (1-1/e)-approximate solutions for monotone submodular maximization, many applications require solutions with better approximation guarantees; moreover, it is desirable to be able to control the trade-off between the computation time and approximation guarantee. Given this background, the best-first search (BFS) has been recently studied as a promising approach. However, existing BFS-based methods for submodular maximization sometimes suffer excessive computation cost since their heuristic functions are not well designed. In this paper, we propose an accelerated BFS for monotone submodular maximization with a knapsack constraint. The acceleration is attained by introducing a new termination condition and developing a novel method for computing an upper-bound of the optimal value for submodular maximization, which enables us to use a better heuristic function. Experiments show that our accelerated BFS is far more efficient in terms of both time and space complexities than existing methods.

关键词

submodular function; best-first search

全文

PDF


Submodular Function Maximization Over Graphs via Zero-Suppressed Binary Decision Diagrams

摘要

Submodular function maximization (SFM) has attracted much attention thanks to its applicability to various practical problems. Although most studies have considered SFM with size or budget constraints, more complex constraints often appear in practice. In this paper, we consider a very general class of SFM with such complex constraints (e.g., an s-t path constraint on a given graph). We propose a novel algorithm that takes advantage of zero-suppressed binary decision diagrams, which store all feasible solutions efficiently thus enabling us to circumvent the difficulty of determining feasibility. Theoretically, our algorithm is guaranteed to achieve (1-c)-approximations, where c is the curvature of a submodular function. Experiments show that our algorithm runs much faster than exact algorithms and finds better solutions than those obtained by an existing approximation algorithm in many instances. Notably, our algorithm achieves better than a 90%-approximation in all instances for which optimal values are available.

关键词

submodular function; graph; zero-suppressed binary decision diagram

全文

PDF


Counting Linear Extensions in Practice: MCMC Versus Exponential Monte Carlo

摘要

Counting the linear extensions of a given partial order is a #P-complete problem that arises in numerous applications. For polynomial-time approximation, several Markov chain Monte Carlo schemes have been proposed; however, little is known of their efficiency in practice. This work presents an empirical evaluation of the state-of-the-art schemes and investigates a number of ideas to enhance their performance. In addition, we introduce a novel approximation scheme, adaptive relaxation Monte Carlo (ARMC), that leverages exact exponential-time counting algorithms. We show that approximate counting is feasible up to a few hundred elements on various classes of partial orders, and within this range ARMC typically outperforms the other schemes.

关键词

partial order; linear extension; counting; Monte Carlo; MCMC; practice

全文

PDF


Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning

摘要

In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.

关键词

Linear Regression

全文

PDF


Noisy Derivative-Free Optimization With Value Suppression

摘要

Derivative-free optimization has shown advantage in solving sophisticated problems such as policy search, when the environment is noise-free. Many real-world environments are noisy, where solution evaluations are inaccurate due to the noise. Noisy evaluation can badly injure derivative-free optimization, as it may make a worse solution looks better. Sampling is a straightforward way to reduce noise, while previous studies have shown that delay the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work further delays the noise handling, and proposes a simple noise handling mechanism, i.e., value suppression. By value suppression, we do nothing about noise until the best-so-far solution has not been improved for a period, and then suppress the value of the best-so-far solution and continue the optimization. On synthetic problems as well as reinforcement learning tasks, experiments verify that value suppression can be significantly more effective than the previous methods.

关键词

Derivative-free Optimization; Noisy Environment; Direct Policy Search; Value Suppression

全文

PDF


Memory-Augmented Monte Carlo Tree Search

摘要

This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-MCTS), which provides a new approach to exploit generalization in online real-time search. The key idea of M-MCTS is to incorporate MCTS with a memory structure, where each entry contains information of a particular state. This memory is used to generate an approximate value estimation by combining the estimations of similar states. We show that the memory based value approximation is better than the vanilla Monte Carlo estimation with high probability under mild conditions. We evaluate M-MCTS in the game of Go. Experimental results show that M-MCTS outperforms the original MCTS with the same number of simulations.

关键词

Monte Carlo tree search; Memory; Value function estimation

全文

PDF


A Coverage-Based Utility Model for Identifying Unknown Unknowns

摘要

A classifier’s low confidence in prediction is often indicative of whether its prediction will be wrong; in this case, inputs are called known unknowns. In contrast, unknown unknowns (UUs) are inputs on which a classifier makes a high confidence mistake. Identifying UUs is especially important in safety-critical domains like medicine (diagnosis) and law (recidivism prediction). Previous work by Lakkaraju et al. (2017) on identifying unknown unknowns assumes that the utility of each revealed UU is independent of the others, rather than considering the set holistically. While this assumption yields an efficient discovery algorithm, we argue that it produces an incomplete understanding of the classifier’s limitations. In response, this paper proposes a new class of utility models that rewards how well the discovered UUs cover (or "explain") a sample distribution of expected queries. Although choosing an optimal cover is intractable, even if the UUs were known, our utility model is monotone submodular, affording a greedy discovery strategy. Experimental results on four datasets show that our method outperforms bandit-based approaches and achieves within 60.9% utility of an omniscient, tractable upper bound.

关键词

全文

PDF


Toward Deep Reinforcement Learning Without a Simulator: An Autonomous Steering Example

摘要

We propose a scheme for training a computerized agent to perform complex human tasks such as highway steering. The scheme is designed to follow a natural learning process whereby a human instructor teaches a computerized trainee. It enables leveraging the weak supervision abilities of a (human) instructor, who, while unable to perform well herself at the required task, can provide coherent and learnable instantaneous reward signals to the computerized trainee. The learning process consists of three supervised elements followed by reinforcement learning. The supervised learning stages are: (i) supervised imitation learning; (ii) supervised reward induction; and (iii) supervised safety module construction. We implemented this scheme using deep convolutional networks and applied it to successfully create a computerized agent capable of autonomous highway steering over the well-known racing game Assetto Corsa. We demonstrate that the use of all components is essential to effectively carry out reinforcement learning of the steering task using vision alone, without access to a driving simulator internals, and operating in wall-clock time.

关键词

Autonomous Driving; Deep Learning; Reinforcement Learning; Supervised Learning;

全文

PDF


An Interactive Multi-Label Consensus Labeling Model for Multiple Labeler Judgments

摘要

Multi-label classification is crucial to several practical applications including document categorization, video tagging, targeted advertising etc. Training a multi-label classifier requires a large amount of labeled data which is often unavailable or scarce. Labeled data is then acquired by consulting multiple labelers---both human and machine. Inspired by ensemble methods, our premise is that labels inferred with high consensus among labelers, might be closer to the ground truth. We propose strategies based on interaction and active learning to obtain higher quality labels that potentially lead to greater consensus. We propose a novel formulation that aims to collectively optimize the cost of labeling, labeler reliability, label-label correlation and inter-labeler consensus. Evaluation on data labeled by multiple labelers (both human and machine) shows that our consensus output is closer to the ground truth when compared to the "majority" baseline. We present illustrative cases where it even improves over the existing ground truth. We also present active learning strategies to leverage our consensus model in interactive learning settings. Experiments on several real-world datasets (publicly available) demonstrate the efficacy of our approach in achieving promising classification results with fewer labeled data.

关键词

consensus; active learning; interactive

全文

PDF


Interactively Learning a Blend of Goal-Based and Procedural Tasks

摘要

Agents that can learn new tasks through interactive instruction can utilize goal information to search for and learn flexible policies. This approach can be resilient to variations in initial conditions or issues that arise during execution. However, if a task is not easily formulated as achieving a goal or if the agent lacks sufficient domain knowledge for planning, other methods are required. We present a hybrid approach to interactive task learning that can learn both goal-oriented and procedural tasks, and mixtures of the two, from human natural language instruction. We describe this approach, go through two examples of learning tasks, and outline the space of tasks that the system can learn. We show that our approach can learn a variety of goal-oriented and procedural tasks from a single example and is robust to different amounts of domain knowledge.

关键词

Interactive Task Learning; Human Agent Collaboration; Cognitive Robotics

全文

PDF


Emergence of Grounded Compositional Language in Multi-Agent Populations

摘要

By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis. However, for agents to intelligently interact with humans, simply capturing the statistical patterns is insufficient. In this paper we investigate if, and how, grounded compositional language can emerge as a means to achieve goals in multi-agent populations. Towards this end, we propose a multi-agent learning environment and learning methods that bring about emergence of a basic compositional language. This language is represented as streams of abstract discrete symbols uttered by agents over time, but nonetheless has a coherent structure that possesses a defined vocabulary and syntax. We also observe emergence of non-verbal communication such as pointing and guiding when language communication is unavailable.

关键词

Humans and AI; Multiagent Communication; Language Emergence

全文

PDF


Human-in-the-Loop SLAM

摘要

Building large-scale, globally consistent maps is a challenging problem, made more difficult in environments with limited access, sparse features, or when using data collected by novice users. For such scenarios, where state-of-the-art mapping algorithms produce globally inconsistent maps, we introduce a systematic approach to incorporating sparse human corrections, which we term Human-in-the-Loop Simultaneous Localization and Mapping (HitL-SLAM). Given an initial factor graph for pose graph SLAM, HitL-SLAM accepts approximate, potentially erroneous, and rank-deficient human input, infers the intended correction via expectation maximization (EM), back-propagates the extracted corrections over the pose graph, and finally jointly optimizes the factor graph including the human inputs as human correction factor terms, to yield globally consistent large-scale maps. We thus contribute an EM formulation for inferring potentially rank-deficient human corrections to mapping, and human correction factor extensions to the factor graphs for pose graph SLAM that result in a principled approach to joint optimization of the pose graph while simultaneously accounting for multiple forms of human correction. We present empirical results showing the effectiveness of HitL-SLAM at generating globally accurate and consistent maps even when given poor initial estimates of the map.

关键词

Metric Mapping; Factor graph; Human Augmented Mapping

全文

PDF


An Interpretable Joint Graphical Model for Fact-Checking From Crowds

摘要

Assessing the veracity of claims made on the Internet is an important, challenging, and timely problem. While automated fact-checking models have potential to help people better assess what they read, we argue such models must be explainable, accurate, and fast to be useful in practice; while prediction accuracy is clearly important, model transparency is critical in order for users to trust the system and integrate their own knowledge with model predictions. To achieve this, we propose a novel probabilistic graphical model (PGM) which combines machine learning with crowd annotations. Nodes in our model correspond to claim veracity, article stance regarding claims, reputation of news sources, and annotator reliabilities. We introduce a fast variational method for parameter estimation. Evaluation across two real-world datasets and three scenarios shows that: (1) joint modeling of sources, claims and crowd annotators in a PGM improves the predictive performance and interpretability for predicting claim veracity; and (2) our variational inference method achieves scalably fast parameter estimation, with only modest degradation in performance compared to Gibbs sampling. Regarding model transparency, we designed and deployed a prototype fact-checker Web tool, including a visual interface for explaining model predictions. Results of a small user study indicate that model explanations improve user satisfaction and trust in model predictions. We share our web demo, model source code, and the 13K crowd labels we collected.

关键词

grahphical models; variational inference; crowdsourcing; natural language processing

全文

PDF


How AI Wins Friends and Influences People in Repeated Games With Cheap Talk

摘要

Research has shown that a person's financial success is more dependent on the ability to deal with people than on professional knowledge. Sage advice, such as "if you can't say something nice, don't say anything at all" and principles articulated in Carnegie's classic "How to Win Friends and Influence People," offer trusted rules-of-thumb for how people can successfully deal with each other. However, alternative philosophies for dealing with people have also emerged. The success of an AI system is likewise contingent on its ability to win friends and influence people. In this paper, we study how AI systems should be designed to win friends and influence people in repeated games with cheap talk (RGCTs). We create several algorithms for playing RGCTs by combining existing behavioral strategies (what the AI does) with signaling strategies (what the AI says) derived from several competing philosophies. Via user study, we evaluate these algorithms in four RGCTs. Our results suggest sufficient properties for AIs to win friends and influence people in RGCTs.

关键词

Repeated games; Human-machine collaboration

全文

PDF


Anchors: High-Precision Model-Agnostic Explanations

摘要

We introduce a novel model-agnostic system that explains the behavior of complex models with high-precision rules called anchors, representing local, "sufficient" conditions for predictions. We propose an algorithm to efficiently compute these explanations for any black-box model with high-probability guarantees. We demonstrate the flexibility of anchors by explaining a myriad of different models for different domains and tasks. In a user study, we show that anchors enable users to predict how a model would behave on unseen instances with less effort and higher precision, as compared to existing linear explanations or no explanations.

关键词

machine learning;interpretability

全文

PDF


Optimizing Interventions via Offline Policy Evaluation: Studies in Citizen Science

摘要

Volunteers who help with online crowdsourcing such as citizen science tasks typically make only a few contributions before exiting. We propose a computational approach for increasing users' engagement in such settings that is based on optimizing policies for displaying motivational messages to users. The approach, which we refer to as Trajectory Corrected Intervention (TCI), reasons about the tradeoff between the long-term influence of engagement messages on participants' contributions and the potential risk of disrupting their current work. We combine model-based reinforcement learning with off-line policy evaluation to generate intervention policies, without relying on a fixed representation of the domain. TCI works iteratively to learn the best representation from a set of random intervention trials and to generate candidate intervention policies. It is able to refine selected policies off-line by exploiting the fact that users can only be interrupted once per session.We implemented TCI in the wild with Galaxy Zoo, one of the largest citizen science platforms on the web. We found that TCI was able to outperform the state-of-the-art intervention policy for this domain, and significantly increased the contributions of thousands of users. This work demonstrates the benefit of combining traditional AI planning with off-line policy methods to generate intelligent intervention strategies.

关键词

Citizen Science; Off-line Policy Evaluation

全文

PDF


Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces

摘要

While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot oftraining data. One way to increase the speed at which agent sare able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations. Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed the use of deep learning. In this paper, we do both: we propose DeepTAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks inorder to learn complex tasks in just a short amount of time with a human trainer. We demonstrate Deep TAMER’s success by using it and just 15 minutes of human-provided feedback to train an agent that performs better than humans on the Atari game of Bowling - a task that has proven difficult for even state-of-the-art reinforcement learning methods.

关键词

全文

PDF


Semi-Supervised Learning From Crowds Using Deep Generative Models

摘要

Although supervised learning requires a labeled dataset, obtaining labels from experts is generally expensive. For this reason, crowdsourcing services are attracting attention in the field of machine learning as a way to collect labels at relatively low cost. However, the labels obtained by crowdsourcing, i.e., from non-expert workers, are often noisy. A number of methods have thus been devised for inferring true labels, and several methods have been proposed for learning classifiers directly from crowdsourced labels, referred to as "learning from crowds." A more practical problem is learning from crowdsourced labeled data and unlabeled data, i.e., "semi-supervised learning from crowds." This paper presents a novel generative model of the labeling process in crowdsourcing. It leverages unlabeled data effectively by introducing latent features and a data distribution. Because the data distribution can be complicated, we use a deep neural network for the data distribution. Therefore, our model can be regarded as a kind of deep generative model. The problems caused by the intractability of latent variable posteriors is solved by introducing an inference model. The experiments show that it outperforms four existing models, including a baseline model, on the MNIST dataset with simulated workers and the Rotten Tomatoes movie review dataset with Amazon Mechanical Turk workers.

关键词

Crowdsourcing; Semi-supervised Learning; Deep Learning

全文

PDF


Sentiment Analysis via Deep Hybrid Textual-Crowd Learning Model

摘要

Crowdsourcing technique provides an efficient platform to employ human skills in sentiment analysis, which is a difficult task for automatic language models due to the large variations in context, writing style, view point and so on. However, the standard crowdsourcing aggregation models are incompetent when the number of crowd labels per worker is not sufficient to train parameters, or when it is not feasible to collect labels for each sample in a large dataset. In this paper, we propose a novel hybrid model to exploit both crowd and text data for sentiment analysis, consisting of a generative crowdsourcing aggregation model and a deep sentimental autoencoder. Combination of these two sub-models is obtained based on a probabilistic framework rather than a heuristic way. We introduce a unified objective function to incorporate the objectives of both sub-models, and derive an efficient optimization algorithm to jointly solve the corresponding problem. Experimental results indicate that our model achieves superior results in comparison with the state-of-the-art models, especially when the crowd labels are scarce.

关键词

Sentiment Analysis; Deep Learning; Autoencoder; Crowd Label Aggregation

全文

PDF


Understanding Over Participation in Simple Contests

摘要

One key motivation for using contests in real-life is the substantial evidence reported in empirical contest-design literature for people's tendency to act more competitively in contests than predicted by the Nash Equilibrium. This phenomenon has been traditionally explained by people's eagerness to win and maximize their relative (rather than absolute) payoffs. In this paper we make use of "simple contests," where contestants only need to strategize on whether to participate in the contest or not, as an infrastructure for studying whether indeed more effort is exerted in contests due to competitiveness, or perhaps this can be attributed to other factors that hold also in non-competitive settings. The experimental methodology we use compares contestants' participation decisions in eight contest settings differing in the nature of the contest used, the number of contestants used and the theoretical participation predictions to those obtained (whenever applicable) by subjects facing equivalent non-competitive decision situations in the form of a lottery. We show that indeed people tend to over-participate in contests compared to the theoretical predictions, yet the same phenomenon holds (to a similar extent) also in the equivalent non-competitive settings. Meaning that many of the contests used nowadays as a means for inducing extra human effort, that are often complex to organize and manage, can be replaced by a simpler non-competitive mechanism that uses probabilistic prizes.

关键词

全文

PDF


Understanding Social Interpersonal Interaction via Synchronization Templates of Facial Events

摘要

Automatic facial expression analysis in inter-personal communication is challenging. Not only because conversation partners' facial expressions mutually influence each other, but also because no correct interpretation of facial expressions is possible without taking social context into account. In this paper, we propose a probabilistic framework to model interactional synchronization between conversation partners based on their facial expressions. Interactional synchronization manifests temporal dynamics of conversation partners' mutual influence. In particular, the model allows us to discover a set of common and unique facial synchronization templates directly from natural interpersonal interaction without recourse to any predefined labeling schemes. The facial synchronization templates represent periodical facial event coordinations shared by multiple conversation pairs in a specific social context. We test our model on two different dyadic conversations of negotiation and job-interview. Based on the discovered facial event coordination, we are able to predict their conversation outcomes with higher accuracy than HMMs and GMMs.

关键词

Facial expression; Social interaction; Interactional synchrony; Video conferencing; Coupled hidden Markov model; Beta-Bernoulli process; Bayes nonparametrics; Gibbs sampling

全文

PDF


A Voting-Based System for Ethical Decision Making

摘要

We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website.

关键词

全文

PDF


Partial Truthfulness in Minimal Peer Prediction Mechanisms With Limited Knowledge

摘要

We study minimal single-task peer prediction mechanisms that have limited knowledge about agents' beliefs. Without knowing what agents' beliefs are or eliciting additional information, it is not possible to design a truthful mechanism in a Bayesian-Nash sense. We go beyond truthfulness and explore equilibrium strategy profiles that are only partially truthful. Using the results from the multi-armed bandit literature, we give a characterization of how inefficient these equilibria are comparing to truthful reporting. We measure the inefficiency of such strategies by counting the number of dishonest reports that any minimal knowledge-bounded mechanism must have. We show that the order of this number is θ(log n), where n is the number of agents, and we provide a peer prediction mechanism that achieves this bound in expectation.

关键词

全文

PDF


Information Gathering With Peers: Submodular Optimization With Peer-Prediction Constraints

摘要

We study a problem of optimal information gathering from multiple data providers that need to be incentivized to provide accurate information. This problem arises in many real world applications that rely on crowdsourced data sets, but where the process of obtaining data is costly. A notable example of such a scenario is crowd sensing. To this end, we formulate the problem of optimal information gathering as maximization of a submodular function under a budget constraint, where the budget represents the total expected payment to data providers. Contrary to the existing approaches, we base our payments on incentives for accuracy and truthfulness, in particular, peer prediction methods that score each of the selected data providers against its best peer, while ensuring that the minimum expected payment is above a given threshold. We first show that the problem at hand is hard to approximate within a constant factor that is not dependent on the properties of the payment function. However, for given topological and analytical properties of the instance, we construct two greedy algorithms, respectively called PPCGreedy and PPCGreedyIter, and establish theoretical bounds on their performance w.r.t. the optimal solution. Finally, we evaluate our methods using a realistic crowd sensing testbed.

关键词

全文

PDF


Deep Learning from Crowds

摘要

Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of data in a scalable manner, but it often requires aggregating labels from multiple noisy contributors with different levels of expertise. In this paper, we address the problem of learning deep neural networks from crowds. We begin by describing an EM algorithm for jointly learning the parameters of the network and the reliabilities of the annotators. Then, a novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation. We empirically show that the proposed approach is able to internally capture the reliability and biases of different annotators and achieve new state-of-the-art results for various crowdsourced datasets across different settings, namely classification, regression and sequence labeling.

关键词

deep learning; crowdsourcing; multiple annotators; neural networks; image classification; text regression; sequence labelling

全文

PDF


AdaFlock: Adaptive Feature Discovery for Human-in-the-loop Predictive Modeling

摘要

Feature engineering is the key to successful application of machine learning algorithms to real-world data. The discovery of informative features often requires domain knowledge or human inspiration, and data scientists expend a certain amount of effort into exploring feature spaces. Crowdsourcing is considered a promising approach for allowing many people to be involved in feature engineering; however, there is a demand for a sophisticated strategy that enables us to acquire good features at a reasonable crowdsourcing cost. In this paper, we present a novel algorithm called AdaFlock to efficiently obtain informative features through crowdsourcing. AdaFlock is inspired by AdaBoost, which iteratively trains classifiers by increasing the weights of samples misclassified by previous classifiers. AdaFlock iteratively generates informative features; at each iteration of AdaFlock, crowdsourcing workers are shown samples selected according to the classification errors of the current classifiers and are asked to generate new features that are helpful for correctly classifying the given examples. The results of our experiments conducted using real datasets indicate that AdaFlock successfully discovers informative features with fewer iterations and achieves high classification accuracy.

关键词

全文

PDF


Adversarial Learning for Chinese NER From Crowd Annotations

摘要

To quickly obtain new labeled data, we can choose crowdsourcing as an alternative way at lower cost in a short time. But as an exchange, crowd annotations from non-experts may be of lower quality than those from experts. In this paper, we propose an approach to performing crowd annotation learning for Chinese Named Entity Recognition (NER) to make full use of the noisy sequence labels from multiple annotators. Inspired by adversarial learning, our approach uses a common Bi-LSTM and a private Bi-LSTM for representing annotator-generic and -specific information. The annotator-generic information is the common knowledge for entities easily mastered by the crowd. Finally, we build our Chinese NE tagger based on the LSTM-CRF model. In our experiments, we create two data sets for Chinese NER tasks from two domains. The experimental results show that our system achieves better scores than strong baseline systems.

关键词

Named Entity Recognition; Crowd Annotations

全文

PDF


Adapting a Kidney Exchange Algorithm to Align With Human Values

摘要

The efficient allocation of limited resources is a classical problem in economics and computer science. In kidney exchanges, a central market maker allocates living kidney donors to patients in need of an organ. Patients and donors in kidney exchanges are prioritized using ad-hoc weights decided on by committee and then fed into an allocation algorithm that determines who get what—and who does not. In this paper, we provide an end-to-end methodology for estimating weights of individual participant profiles in a kidney exchange. We first elicit from human subjects a list of patient attributes they consider acceptable for the purpose of prioritizing patients (e.g., medical characteristics, lifestyle choices, and so on). Then, we ask subjects comparison queries between patient profiles and estimate weights in a principled way from their responses. We show how to use these weights in kidney exchange market clearing algorithms. We then evaluate the impact of the weights in simulations and find that the precise numerical values of the weights we computed matter little, other than the ordering of profiles that they imply. However, compared to not prioritizing patients at all, there is a significant effect, with certain classes of patients being (de)prioritized based on the human-elicited value judgments.

关键词

Kidney exchange; Moral AI; Ethical AI

全文

PDF


State of the Art: Reproducibility in Artificial Intelligence

摘要

Background: Research results in artificial intelligence (AI) are criticized for not being reproducible. Objective: To quantify the state of reproducibility of empirical AI research using six reproducibility metrics measuring three different degrees of reproducibility. Hypotheses: 1) AI research is not documented well enough to reproduce the reported results. 2) Documentation practices have improved over time. Method: The literature is reviewed and a set of variables that should be documented to enable reproducibility are grouped into three factors: Experiment, Data and Method. The metrics describe how well the factors have been documented for a paper. A total of 400 research papers from the conference series IJCAI and AAAI have been surveyed using the metrics. Findings: None of the papers document all of the variables. The metrics show that between 20% and 30% of the variables for each factor are documented. One of the metrics show statistically significant increase over time while the others show no change. Interpretation: The reproducibility scores decrease with in- creased documentation requirements. Improvement over time is found. Conclusion: Both hypotheses are supported.

关键词

empirical research; reproducible research; research method; documentation

全文

PDF


Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

摘要

Machine learning systems based on deep neural networks, being able to produce state-of-the-art results on various perception tasks, have gained mainstream adoption in many applications. However, they are shown to be vulnerable to adversarial example attack, which generates malicious output by adding slight perturbations to the input. Previous adversarial example crafting methods, however, use simple metrics to evaluate the distances between the original examples and the adversarial ones, which could be easily detected by human eyes. In addition, these attacks are often not robust due to the inevitable noises and deviation in the physical world. In this work, we present a new adversarial example attack crafting method, which takes the human perceptual system into consideration and maximizes the noise tolerance of the crafted adversarial example. Experimental results demonstrate the efficacy of the proposed technique.

关键词

machine learning; adversarial; security; DNN

全文

PDF


Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing Their Input Gradients

摘要

Deep neural networks have proven remarkably effective at solving many classification problems, but have been criticized recently for two major weaknesses: the reasons behind their predictions are uninterpretable, and the predictions themselves can often be fooled by small adversarial perturbations. These problems pose major obstacles for the adoption of neural networks in domains that require security or transparency. In this work, we evaluate the effectiveness of defenses that differentiably penalize the degree to which small changes in inputs can alter model predictions. Across multiple attacks, architectures, defenses, and datasets, we find that neural networks trained with this input gradient regularization exhibit robustness to transferred adversarial examples generated to fool all of the other models. We also find that adversarial examples generated to fool gradient-regularized models fool all other models equally well, and actually lead to more "legitimate," interpretable misclassifications as rated by people (which we confirm in a human subject experiment). Finally, we demonstrate that regularizing input gradients makes them more naturally interpretable as rationales for model predictions. We conclude by discussing this relationship between interpretability and robustness in deep neural networks.

关键词

adversarial machine learning; interpretability; double backpropagation; regularization; input gradients; explanations; distillation; adversarial examples; adversarial defenses

全文

PDF


Beyond Sparsity: Tree Regularization of Deep Models for Interpretability

摘要

The lack of interpretability remains a key barrier to the adoption of deep models in many applications. In this work, we explicitly regularize deep models so human users might step through the process behind their predictions in little time. Specifically, we train deep time-series models so their class-probability predictions have high accuracy while being closely modeled by decision trees with few nodes. Using intuitive toy examples as well as medical tasks for treating sepsis and HIV, we demonstrate that this new tree regularization yields models that are easier for humans to simulate than simpler L1 or L2 penalties without sacrificing predictive power.

关键词

explainable AI; neural networks; decision trees

全文

PDF


Coupled Deep Learning for Heterogeneous Face Recognition

摘要

Heterogeneous face matching is a challenge issue in face recognition due to large domain difference as well as insufficient pairwise images in different modalities during training. This paper proposes a coupled deep learning (CDL) approach for the heterogeneous face matching. CDL seeks a shared feature space in which the heterogeneous face matching problem can be approximately treated as a homogeneous face matching problem. The objective function of CDL mainly includes two parts. The first part contains a trace norm and a block-diagonal prior as relevance constraints, which not only make unpaired images from multiple modalities be clustered and correlated, but also regularize the parameters to alleviate overfitting. An approximate variational formulation is introduced to deal with the difficulties of optimizing low-rank constraint directly. The second part contains a cross modal ranking among triplet domain specific images to maximize the margin for different identities and increase data for a small amount of training samples. Besides, an alternating minimization method is employed to iteratively update the parameters of CDL. Experimental results show that CDL achieves better performance on the challenging CASIA NIR-VIS 2.0 face recognition database, the IIIT-D Sketch database, the CUHK Face Sketch (CUFS), and the CUHK Face Sketch FERET (CUFSF), which significantly outperforms state-of-the-art heterogeneous face recognition methods.

关键词

Heterogeneous Face Recognition; CNN;

全文

PDF


A Low-Cost Ethics Shaping Approach for Designing Reinforcement Learning Agents

摘要

This paper proposes a low-cost, easily realizable strategy to equip a reinforcement learning (RL) agent the capability of behaving ethically. Our model allows the designers of RL agents to solely focus on the task to achieve, without having to worry about the implementation of multiple trivial ethical patterns to follow. Based on the assumption that the majority of human behavior, regardless which goals they are achieving, is ethical, our design integrates human policy with the RL policy to achieve the target objective with less chance of violating the ethical code that human beings normally obey.

关键词

machine ethics; ethical agents; ethical decision making; reward shaping; reinforcement learning; human feedback

全文

PDF


Deception Detection in Videos

摘要

We present a system for covert automated deception detection using information available in a video. We study the importance of different modalities like vision, audio and text for this task. On the vision side, our system uses classifiers trained on low level video features which predict human micro-expressions. We show that predictions of high-level micro-expressions can be used as features for deception prediction. Surprisingly, IDT (Improved Dense Trajectory) features which have been widely used for action recognition, are also very good at predicting deception in videos. We fuse the score of classifiers trained on IDT features and high-level micro-expressions to improve performance. MFCC (Mel-frequency Cepstral Coefficients) features from the audio domain also provide a significant boost in performance, while information from transcripts is not very beneficial for our system. Using various classifiers, our automated system obtains an AUC of 0.877 (10-fold cross-validation) when evaluated on subjects which were not part of the training set. Even though state-of-the-art methods use human annotations of micro-expressions for deception detection, our fully automated approach outperforms them by 5%. When combined with human annotations of micro-expressions, our AUC improves to 0.922. We also present results of a user-study to analyze how well do average humans perform on this task, what modalities they use for deception detection and how they perform if only one modality is accessible.

关键词

deception detection; videos

全文

PDF


Cascade and Parallel Convolutional Recurrent Neural Networks on EEG-based Intention Recognition for Brain Computer Interface

摘要

Brain-Computer Interface (BCI) is a system empowering humans to communicate with or control the outside world with exclusively brain intentions. Electroencephalography (EEG) based BCIs are promising solutions due to their convenient and portable instruments. Despite the extensive research of EEG in recent years, it is still challenging to interpret EEG signals effectively due to the massive noises in EEG signals (e.g., low signal-noise ratio and incomplete EEG signals), and difficulties in capturing the inconspicuous relationships between EEG signals and certain brain activities. Most existing works either only consider EEG as chain-like sequences neglecting complex dependencies between adjacent signals or requiring pre-processing such as transforming EEG waves into images. In this paper, we introduce both cascade and parallel convolutional recurrent neural network models for precisely identifying human intended movements and instructions effectively learning the compositional spatio-temporal representations of raw EEG streams. Extensive experiments on a large scale movement intention EEG dataset (108 subjects,3,145,160 EEG records) have demonstrated that both models achieve high accuracy near 98.3% and outperform a set of baseline methods and most recent deep learning based EEG recognition models, yielding a significant accuracy increase of 18% in the cross-subject validation scenario. The developed models are further evaluated with a real-world BCI and achieve a recognition accuracy of 93% over five instruction intentions. This suggests the proposed models are able to generalize over different kinds of intentions and BCI systems.

关键词

全文

PDF


WiFi-Based Human Identification via Convex Tensor Shapelet Learning

摘要

We propose AutoID, a human identification system that leverages the measurements from existing WiFi-enabled Internet of Things (IoT) devices and produces the identity estimation via a novel sparse representation learning technique. The key idea is to use the unique fine-grained gait patterns of each person revealed from the WiFi Channel State Information (CSI) measurements, technically referred to as shapelet signatures, as the "fingerprint" for human identification. For this purpose, a novel OpenWrt-based IoT platform is designed to collect CSI data from commercial IoT devices. More importantly, we propose a new optimization-based shapelet learning framework for tensors, namely Convex Clustered Concurrent Shapelet Learning (C3SL), which formulates the learning problem as a convex optimization. The global solution of C3SL can be obtained efficiently with a generalized gradient-based algorithm, and the three concurrent regularization terms reveal the inter-dependence and the clustering effect of the CSI tensor data. Extensive experiments are conducted in multiple real-world indoor environments, showing that AutoID achieves an average human identification accuracy of 91% from a group of 20 people. As a combination of novel sensing and learning platform, AutoID attains substantial progress towards a more accurate, cost-effective and sustainable human identification system for pervasive implementations.

关键词

Human Identification; Shapelet Learning

全文

PDF


Externally Supported Models for Efficient Computation of Paracoherent Answer Sets

摘要

Answer Set Programming (ASP) is a well-established formalism for nonmonotonic reasoning.While incoherence, the non-existence of answer sets for some programs, is an important feature of ASP, it has frequently been criticised and indeed has some disadvantages, especially for query answering.Paracoherent semantics have been suggested as a remedy, which extend the classical notion of answer sets to draw meaningful conclusions also from incoherent programs. In this paper we present an alternative characterization of the two major paracoherent semantics in terms of (extended) externally supported models. This definition uses a transformation of ASP programs that is more parsimonious than the classic epistemic transformation used in recent implementations.A performance comparison carried out on benchmarks from ASP competitions shows that the usage of the new transformation brings about performance improvements that are independent of the underlying algorithms.

关键词

Knowledge Representation and Reasoning; Answer Set Programming; Paracoherent Answer Sets

全文

PDF


Combining Rules and Ontologies into Clopen Knowledge Bases

摘要

We propose Clopen Knowledge Bases (CKBs) as a new formalism combining Answer Set Programming (ASP) with ontology languages based on first-order logic. CKBs generalize the prominent r-hybrid and DL+LOG languages of Rosati, and are more flexible for specification of problems that combine open-world and closed-world reasoning. We argue that the guarded negation fragment of first-order logic(GNFO)—a very expressive fragment that subsumes many prominent ontology languages like Description Logics (DLs) and the guarded fragment—is an ontology language that can be used in CKBs while enjoying decidability for basic reasoning problems. We further show how CKBs can be used with expressive DLs of the ALC family, and obtain worst-case optimal complexity results in this setting. For DL-based CKBs, we define a fragment called separable CKBs (which still strictly subsumes r-hybrid and DL+LOG knowledge bases), and show that they can be rather efficiently translated into standard ASP programs. This approach allows us to perform basic inference from separable CKBs by reusing existing efficient ASP solvers. We have implemented the approach for separable CKBs containing ontologies in the DL ALCH, and present in this paper some promising empirical results for real-life data. They show that our approach provides a dramatic improvement over a naive implementation based on a translation of such CKBs into dl-programs.

关键词

KR&R; Computational Complexity of Reasoning; Description Logics; Knowledge Representation Languages; Logic Programming; Nonmonotonic Reasoning

全文

PDF


How Many Properties Do We Need for Gradual Argumentation?

摘要

The study of properties of gradual evaluation methods in argumentation has received increasing attention in recent years, with studies devoted to various classes of frameworks/methods leading to conceptually similar but formally distinct properties in different contexts. In this paper we provide a systematic analysis for this research landscape by making three main contributions. First, we identify groups of conceptually related properties in the literature, which can be regarded as based on common patterns and, using these patterns, we evidence that many further properties can be considered. Then, we provide a simplifying and unifying perspective for these properties by showing that they are all implied by the parametric principles of (either strict or non-strict) balance and monotonicity. Finally, we show that (instances of) these principles are satisfied by several quantitative argumentation formalisms in the literature, thus confirming their general validity and their utility to support a compact, yet comprehensive, analysis of properties of gradual argumentation.

关键词

Bipolar Argumentation; Quantitative Argumentation; Gradual Evaluation

全文

PDF


Situation Calculus Semantics for Actual Causality

摘要

The definitions of actual cause given by Pearl and Halpern (HP) in the framework of causal models provided vital computational insight into an old philosophical problem but by no means resolved it. One source of concern is the lack of objective criteria for selecting possible worlds to be admitted into the counterfactual analysis, epitomized by the competition between multiple proposals by HP and others. Another concern is due to the modest expressivity of propositional-level structural equations which limits their applicability and, arguably, contributes to the the former problem. We tackle both of these issues using a novel approach. We build our definition of actual cause from first principles in the context of atemporal situation calculus (SC) action theories with sequential actions. As a result, we can successfully identify actual causes of conditions expressed in first-order logic. We validate the HP approach by providing a formal translation from causal models to SC and proving a relationship between our definitions of actual cause and that of HP. Using well-known and new examples, we show that long-standing disagreements between alternative definitions of actual causality can be mitigated by faithful SC modelling of the domains.

关键词

actual causality; situation calculus

全文

PDF


Complexity of Verification in Incomplete Argumentation Frameworks

摘要

Abstract argumentation frameworks are a well-established formalism to model nonmonotonic reasoning processes. However, the standard model cannot express incomplete or conflicting knowledge about the state of a given argumentation. Previously, argumentation frameworks were extended to allow uncertainty regarding the set of attacks or the set of arguments. We combine both models into a model of general incompleteness, complement previous results on the complexity of the verification problem in incomplete argumentation frameworks, and provide a full complexity map covering all three models and all classical semantics. Our main result shows that the complexity of verifying the preferred semantics rises from coNP- to Sigma^p_2-completeness when allowing uncertainty about either attacks or arguments, or both.

关键词

abstract argumentation; argumentation framework; incomplete knowledge; verification; computational complexity

全文

PDF


Goal-Driven Query Answering for Existential Rules With Equality

摘要

Inspired by the magic sets for Datalog, we present a novel goal-driven approach for answering queries over terminating existential rules with equality (aka TGDs and EGDs). Our technique improves the performance of query answering by pruning the consequences that are not relevant for the query. This is challenging in our setting because equalities can potentially affect all predicates in a dataset. We address this problem by combining the existing singularization technique with two new ingredients: an algorithm for identifying the rules relevant to a query and a new magic sets algorithm. We show empirically that our technique can significantly improve the performance of query answering, and that it can mean the difference between answering a query in a few seconds or not being able to process the query at all.

关键词

reasoning; query answering

全文

PDF


LTLf/LDLf Non-Markovian Rewards

摘要

In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., depends on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work. Both use LTL variants to specify the reward function and then compile the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.

关键词

MDPs; non-Markovian Rewards; LTLf/LDLf

全文

PDF


Weighted Abstract Dialectical Frameworks

摘要

Abstract Dialectical Frameworks (ADFs) generalize Dung's argumentation frameworks allowing various relationships among arguments to be expressed in a systematic way. We further generalize ADFs so as to accommodate arbitrary acceptance degrees for the arguments. This makes ADFs applicable in domains where both the initial status of arguments and their relationship are only insufficiently specified by Boolean functions. We define all standard ADF semantics for the weighted case, including grounded, preferred and stable semantics. We illustrate our approach using acceptance degrees from the unit interval and show how other valuation structures can be integrated. In each case it is sufficient to specify how the generalized acceptance conditions are represented by formulas, and to specify the information ordering underlying the characteristic ADF operator. We also present complexity results for problems related to weighted ADFs.

关键词

argumentation; nonmonotonic reasoning

全文

PDF


SELF: Structural Equational Likelihood Framework for Causal Discovery

摘要

Causal discovery without intervention is well recognized as a challenging yet powerful data analysis tool, boosting the development of other scientific areas, such as biology, astronomy, and social science. The major technical difficulty behind the observation-based causal discovery is to effectively and efficiently identify causes and effects from correlated variables given the existence of significant noises. Previous studies mostly employ two very different methodologies under Bayesian network framework, namely global likelihood maximization and locally complexity analysis over marginal distributions. While these approaches are effective in their respective problem domains, in this paper, we show that they can be combined to formulate a new global optimization model with local statistical significance, called structural equational likelihood framework (or SELF in short). We provide thorough analysis on the soundness of the model under mild conditions and present efficient heuristic-based algorithms for scalable model training. Empirical evaluations using XGBoost validate the superiority of our proposal over state-of-the-art solutions, on both synthetic and real world causal structures.

关键词

causal discovery;structural equation model;likelihood function

全文

PDF


SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings

摘要

With the recent development of deep learning, research in AI has gained new vigor and prominence. While machine learning has succeeded in revitalizing many research fields, such as computer vision, speech recognition, and medical diagnosis, we are yet to witness impressive progress in natural language understanding. One of the reasons behind this unmatched expectation is that, while a bottom-up approach is feasible for pattern recognition, reasoning and understanding often require a top-down approach. In this work, we couple sub-symbolic and symbolic AI to automatically discover conceptual primitives from text and link them to commonsense concepts and named entities in a new three-level knowledge representation for sentiment analysis. In particular, we employ recurrent neural networks to infer primitives by lexical substitution and use them for grounding common and commonsense knowledge by means of multi-dimensional scaling.

关键词

Commonsense reasoning; NLP; LSTM; SenticNet

全文

PDF


Towards a Unified Framework for Syntactic Inconsistency Measures

摘要

A number of proposals have been made to define inconsistency measures. Each has its rationale. But to date, it is not clear how to delineate the space of options for measures, nor is it clear how we can classify measures systematically. In this paper, we introduce a general framework for comparing syntactic inconsistency measures. It uses the construction of an inconsistency graph for each knowledgebase. We then introduce abstractions of the inconsistency graph and use the hierarchy of the abstractions to classify a range of inconsistency measures.

关键词

inconsistency measurement; syntactic measures

全文

PDF


Convolutional 2D Knowledge Graph Embeddings

摘要

Link prediction for knowledge graphs is the task of predicting missing relationships between entities. Previous work on link prediction has focused on shallow, fast models which can scale to large knowledge graphs. However, these models learn less expressive features than deep, multi-layer models — which potentially limits performance. In this work we introduce ConvE, a multi-layer convolutional network model for link prediction, and report state-of-the-art results for several established datasets. We also show that the model is highly parameter efficient, yielding the same performance as DistMult and R-GCN with 8x and 17x fewer parameters. Analysis of our model suggests that it is particularly effective at modelling nodes with high indegree — which are common in highly-connected, complex knowledge graphs such as Freebase and YAGO3. In addition, it has been noted that the WN18 and FB15k datasets suffer from test set leakage, due to inverse relations from the training set being present in the test set — however, the extent of this issue has so far not been quantified. We find this problem to be severe: a simple rule-based model can achieve state-of-the-art results on both WN18 and FB15k. To ensure that models are evaluated on datasets where simply exploiting inverse relations cannot yield competitive results, we investigate and validate several commonly used datasets — deriving robust variants where necessary. We then perform experiments on these robust datasets for our own and several previously proposed models, and find that ConvE achieves state-of-the-art Mean Reciprocal Rank across all datasets.

关键词

knowledge graphs; representation learning; convolution

全文

PDF


TorusE: Knowledge Graph Embedding on a Lie Group

摘要

Knowledge graphs are useful for many artificial intelligence (AI) tasks. However, knowledge graphs often have missing facts. To populate the graphs, knowledge graph embedding models have been developed. Knowledge graph embedding models map entities and relations in a knowledge graph to a vector space and predict unknown triples by scoring candidate triples. TransE is the first translation-based method and it is well known because of its simplicity and efficiency for knowledge graph completion. It employs the principle that the differences between entity embeddings represent their relations. The principle seems very simple, but it can effectively capture the rules of a knowledge graph. However, TransE has a problem with its regularization. TransE forces entity embeddings to be on a sphere in the embedding vector space. This regularization warps the embeddings and makes it difficult for them to fulfill the abovementioned principle. The regularization also affects adversely the accuracies of the link predictions. On the other hand, regularization is important because entity embeddings diverge by negative sampling without it. This paper proposes a novel embedding model, TorusE, to solve the regularization problem. The principle of TransE can be defined on any Lie group. A torus, which is one of the compact Lie groups, can be chosen for the embedding space to avoid regularization. To the best of our knowledge, TorusE is the first model that embeds objects on other than a real or complex vector space, and this paper is the first to formally discuss the problem of regularization of TransE. Our approach outperforms other state-of-the-art approaches such as TransE, DistMult and ComplEx on a standard link prediction task. We show that TorusE is scalable to large-size knowledge graphs and is faster than the original TransE.

关键词

knowledge graph; knowledge inference

全文

PDF


Rational Inference Patterns Based on Conditional Logic

摘要

Conditional information is an integral part of representation and inference processes of causal relationships, temporal events, and even the deliberation about impossible scenarios of cognitive agents. For formalizing these inferences, a proper formal representation is needed. Psychological studies indicate that classical, monotonic logic is not the approriate model for capturing human reasoning: There are cases where the participants systematically deviate from classically valid answers, while in other cases they even endorse logically invalid ones. Many analyses covered the independent analysis of individual inference rules applied by human reasoners. In this paper we define inference patterns as a formalization of the joint usage or avoidance of these rules. Considering patterns instead of single inferences opens the way for categorizing inference studies with regard to their qualitative results. We apply plausibility relations which provide basic formal models for many theories of conditionals, nonmonotonic reasoning, and belief revision to asses the rationality of the patterns and thus the individual inferences drawn in the study. By this replacement of classical logic with formalisms most suitable for conditionals, we shift the basis of judging rationality from compatibility with classical entailment to consistency in a logic of conditionals. Using inductive reasoning on the plausibility relations we reverse engineer conditional knowledge bases as explanatory model for and formalization of the background knowledge of the participants. In this way the conditional knowledge bases derived from the inference patterns provide an explanation for the outcome of the study that generated the inference pattern.

关键词

Conditionals; Counterfactuals; Suppression Task; Rationality; Ranking Functions; OCF; c-Representations; Preferential Inference; Plausible Reasoning; Explaining Human Inference

全文

PDF


Dependence in Propositional Logic: Formula-Formula Dependence and Formula Forgetting – Application to Belief Update and Conservative Extension

摘要

Dependence is an important concept for many tasks in artificial intelligence. A task can be executed more efficiently by discarding something independent from the task. In this paper, we propose two novel notions of dependence in propositional logic: formula-formula dependence and formula forgetting. The first is a relation between formulas capturing whether a formula depends on another one, while the second is an operation that returns the strongest consequence independent of a formula. We also apply these two notions in two well-known issues: belief update and conservative extension. Firstly, we define a new update operator based on formula-formula dependence. Furthermore, we reduce conservative extension to formula forgetting.

关键词

全文

PDF


Answering Regular Path Queries over SQ Ontologies

摘要

We study query answering in the description logic SQ supporting qualified number restrictions on both transitive and non-transitive roles. Our main contributions are a tree-like model property for SQ-knowledge bases and, building upon this, an optimal automata-based algorithm for answering positive existential regular path queries in 2EXPTIME.

关键词

Knowledge Representation and Reasoning; Description Logics; Ontologies; Query-Answering; Regular Path Queries; Transitivity; Number Restrictions

全文

PDF


Towards Formal Definitions of Blameworthiness, Intention, and Moral Responsibility

摘要

We provide formal definitions of degree of blameworthiness and intention relative to an epistemic state (a probability over causal models and a utility function on outcomes). These, together with a definition of actual causality, provide the key ingredients for moral responsibility judgments. We show that these definitions give insight into commonsense intuitions in a variety of puzzling cases from the literature.

关键词

causality, (moral) responsibility, intention, blameworthiness

全文

PDF


Behavior Is Everything: Towards Representing Concepts with Sensorimotor Contingencies

摘要

AI has seen remarkable progress in recent years, due to a switch from hand-designed shallow representations, to learned deep representations. While these methods excel with plentiful training data, they are still far from the human ability to learn concepts from just a few examples by reusing previously learned conceptual knowledge in new contexts. We argue that this gap might come from a fundamental misalignment between human and typical AI representations: while the former are grounded in rich sensorimotor experience, the latter are typically passive and limited to a few modalities such as vision and text. We take a step towards closing this gap by proposing an interactive, behavior-based model that represents concepts using sensorimotor contingencies grounded in an agent's experience. On a novel conceptual learning and benchmark suite, we demonstrate that conceptually meaningful behaviors can be learned, given supervision via training curricula.

关键词

Concept Representation; Hierarchical Reinforcement Learning; Sensorimotor Contingencies; Curriculum Learning; Transfer Learning; Embodied Cognition

全文

PDF


Optimised Maintenance of Datalog Materialisations

摘要

To efficiently answer queries, datalog systems often materialise all consequences of a datalog program, so the materialisation must be updated whenever the input facts change. Several solutions to the materialisation update problem have been proposed. The Delete/Rederive (DRed) and the Backward/Forward (B/F) algorithms solve this problem for general datalog, but both contain steps that evaluate rules "backwards" by matching their heads to a fact and evaluating the partially instantiated rule bodies as queries. We show that this can be a considerable source of overhead even on very small updates. In contrast, the Counting algorithm does not evaluate the rules "backwards," but it can handle only nonrecursive rules. We present two hybrid approaches that combine DRed and B/F with Counting so as to reduce or even eliminate "backward" rule evaluation while still handling arbitrary datalog programs. We show empirically that our hybrid algorithms are usually significantly faster than existing approaches, sometimes by orders of magnitude.

关键词

Datalog; Materialisation; Incremental; Deletion

全文

PDF


Qualitative Reasoning About Cardinal Directions Using Answer Set Programming

摘要

We propose a novel method for representing and reasoning about an incomplete set of constraints about basic/disjunctive qualitative direction relations over simple/connected/disconnected regions, using Answer Set Programming, and prove its correctness with respect to cardinal direction calculus. We extend this method further with default qualitative direction constraints, and discuss its usefulness with some sample scenarios.

关键词

cardinal directional calculus; answer set programming

全文

PDF


Learning Abduction Using Partial Observability

摘要

Juba recently proposed a formulation of learning abductive reasoning from examples, in which both the relative plausibility of various explanations, as well as which explanations are valid, are learned directly from data. The main shortcoming of this formulation of the task is that it assumes access to full-information (i.e., fully specified) examples; relatedly, it offers no role for declarative background knowledge, as such knowledge is rendered redundant in the abduction task by complete information. In this work we extend the formulation to utilize such partially specified examples, along with declarative background knowledge about the missing data. We show that it is possible to use implicitly learned rules together with the explicitly given declarative knowledge to support hypotheses in the course of abduction. We also show how to use knowledge in the form of graphical causal models to refine the proposed hypotheses. Finally, we observe that when a small explanation exists, it is possible to obtain a much-improved guarantee in the challenging exception-tolerant setting. Such small, human-understandable explanations are of particular interest for potential applications of the task.

关键词

Abductive Reasoning; Knowledge Acquisition

全文

PDF


Probabilistic Inference Over Repeated Insertion Models

摘要

Distributions over rankings are used to model user preferences in various settings including political elections and electronic commerce. The Repeated Insertion Model (RIM) gives rise to various known probability distributions over rankings, in particular to the popular Mallows model. However, probabilistic inference on RIM is computationally challenging, and provably intractable in the general case. In this paper we propose an algorithm for computing the marginal probability of an arbitrary partially ordered set over RIM. We analyze the complexity of the algorithm in terms of properties of the model and the partial order, captured by a novel measure termed the "cover width." We also conduct an experimental study of the algorithm over serial and parallelized implementations. Building upon the relationship between inference with rank distributions and counting linear extensions, we investigate the inference problem when restricted to partial orders that lend themselves to efficient counting of their linear extensions.

关键词

Preferences; Probabilistic Inference; Rank distributions

全文

PDF


Question Answering as Global Reasoning Over Semantic Abstractions

摘要

We propose a novel method for exploiting the semantic structure of text to answer multiple-choice questions. The approach is especially suitable for domains that require reasoning over a diverse set of linguistic constructs but have limited training data. To address these challenges, we present the first system, to the best of our knowledge, that reasons over a wide range of semantic abstractions of the text, which are derived using off-the-shelf, general-purpose, pre-trained natural language modules such as semantic role labelers, coreference resolvers, and dependency parsers. Representing multiple abstractions as a family of graphs, we translate question answering (QA) into a search for an optimal subgraph that satisfies certain global and local properties. This formulation generalizes several prior structured QA systems. Our system, SEMANTICILP, demonstrates strong performance on two domains simultaneously. In particular, on a collection of challenging science QA datasets, it outperforms various state-of-the-art approaches, including neural models, broad coverage information retrieval, and specialized techniques using structured knowledge bases, by 2%-6%.

关键词

Question Answering; Reasoning over text; Knowledge Representation

全文

PDF


In Praise of Belief Bases: Doing Epistemic Logic Without Possible Worlds

摘要

We introduce a new semantics for a logic of explicit and implicit beliefs based on the concept of multi-agent belief base. Differently from existing Kripke-style semantics for epistemic logic in which the notions of possible world and doxastic/epistemic alternative are primitive, in our semantics they are non-primitive but are defined from the concept of belief base. We provide a complete axiomatization and a decidability result for our logic.

关键词

Logic; belief bases

全文

PDF


Maximum A Posteriori Inference in Sum-Product Networks

摘要

Sum-product networks (SPNs) are a class of probabilistic graphical models that allow tractable marginal inference. However, the maximum a posteriori (MAP) inference in SPNs is NP-hard. We investigate MAP inference in SPNs from both theoretical and algorithmic perspectives. For the theoretical part, we reduce general MAP inference to its special case without evidence and hidden variables; we also show that it is NP-hard to approximate the MAP problem to 2nε for fixed 0 ≤ ε < 1, where n is the input size. For the algorithmic part, we first present an exact MAP solver that runs reasonably fast and could handle SPNs with up to 1k variables and 150k arcs in our experiments. We then present a new approximate MAP solver with a good balance between speed and accuracy, and our comprehensive experiments on real-world datasets show that it has better overall performance than existing approximate solvers.

关键词

Sum-Product Networks

全文

PDF


Fair Inference on Outcomes

摘要

In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are "sensitive," in the sense of having potential of creating discrimination. In this paper, we argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.

关键词

fair inference; mediation analysis; causal inference; algorithmic bias; constrained inference

全文

PDF


Stream Reasoning in Temporal Datalog

摘要

In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past. Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory. In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications).

关键词

stream reasoning; temporal reasoning; datalog; query answering; stream processing

全文

PDF


On Consensus in Belief Merging

摘要

We define a consensus postulate in the propositional belief merging setting. In a nutshell, this postulate imposes the merged base to be consistent with the pieces of information provided by each agent involved in the merging process. The interplay of this new postulate with the IC postulates for belief merging is studied, and an incompatibility result is proved. The maximal sets of IC postulates which are consistent with the consensus postulate are exhibited. When satisfying some of the remaining IC postulates, consensus operators are shown to suffer from a weak inferential power. We then introduce two families of consensus operators having a better inferential power by setting aside some of these postulates.

关键词

Belief Merging; Consensus; IC postulates

全文

PDF


Open-World Knowledge Graph Completion

摘要

Knowledge Graphs (KGs) have been applied to many tasks including Web search, link prediction, recommendation, natural language processing, and entity linking. However, most KGs are far from complete and are growing at a rapid pace. To address these problems, Knowledge Graph Completion (KGC) has been proposed to improve KGs by filling in its missing connections. Unlike existing methods which hold a closed-world assumption, i.e., where KGs are fixed and new entities cannot be easily added, in the present work we relax this assumption and propose a new open-world KGC task. As a first attempt to solve this task we introduce an open-world KGC model called ConMask. This model learns embeddings of the entity's name and parts of its text-description to connect unseen entities to the KG. To mitigate the presence of noisy text descriptions, ConMask uses a relationship-dependent content masking to extract relevant snippets and then trains a fully convolutional neural network to fuse the extracted snippets with entities in the KG. Experiments on large data sets, both old and new, show that ConMask performs well in the open-world KGC task and even outperforms existing KGC models on the standard closed-world KGC task.

关键词

Knowledge Graph Completion; Open-World; Representation Learning; Knowledge Graph Representation; Knowledge Graph Embedding

全文

PDF


Visual Explanation by High-Level Abduction: On Answer-Set Programming Driven Reasoning About Moving Objects

摘要

We propose a hybrid architecture for systematically computing robust visual explanation(s) encompassing hypothesis formation, belief revision, and default reasoning with video data. The architecture consists of two tightly integrated synergistic components: (1) (functional) answer set programming based abductive reasoning with space-time tracklets as native entities; and (2) a visual processing pipeline for detection based object tracking and motion analysis. We present the formal framework, its general implementation as a (declarative) method in answer set programming, and an example application and evaluation based on two diverse video datasets: the MOTChallenge benchmark developed by the vision community, and a recently developed Movie Dataset.

关键词

Visual Abduction; Answer-Set Programming; Reasoning about Moving Objects

全文

PDF


A Framework and Positive Results for IAR-answering

摘要

Inconsistency-tolerant semantics, like the IAR semantics, have been proposed as means to compute meaningful query answers over inconsistent Description Logic (DL) ontologies. So far query answering under the IAR semantics (IAR-answering) is known to be tractable only for arguably weak DLs like DL-Lite and the quite restricted EL⊥nr fragment of EL⊥. Towards providing a systematic study of IAR-answering, in the current paper we first present a general framework/algorithm for IAR-answering which applies to arbitrary DLs but need not terminate. Nevertheless, this framework allows us to develop a sufficient condition for tractability of IAR-answering and hence of termination of our algorithm. We then show that this condition is always satisfied by the arguably expressive DL DL-Litebool, providing the first positive result for IAR-answering over a non-Horn-DL. In addition, recent results show that this condition usually holds for real-world ontologies and techniques and algorithms for checking it in practice have also been studied recently; thus, overall our results are highly relevant in practice. Finally, we have provided a prototype implementation and a preliminary evaluation obtaining encouraging results.

关键词

全文

PDF


Repairing Ontologies via Axiom Weakening

摘要

Ontology engineering is a hard and error-prone task, in which small changes may lead to errors, or even produce an inconsistent ontology. As ontologies grow in size, the need for automated methods for repairing inconsistencies while preserving as much of the original knowledge as possible increases. Most previous approaches to this task are based on removing a few axioms from the ontology to regain consistency. We propose a new method based on weakening these axioms to make them less restrictive, employing the use of refinement operators. We introduce the theoretical framework for weakening DL ontologies, propose algorithms to repair ontologies based on the framework, and provide an analysis of the computational complexity. Through an empirical analysis made over real-life ontologies, we show that our approach preserves significantly more of the original knowledge of the ontology than removing axioms.

关键词

repair; weakening; complexity

全文

PDF


Measuring Strong Inconsistency

摘要

We address the issue of quantitatively assessing the severity of inconsistencies in nonmonotonic frameworks. While measuring inconsistency in classical logics has been investigated for some time now, taking the nonmonotonicity into account poses new challenges. In order to tackle them, we focus on the structure of minimal strongly kb-inconsistent subsets of a knowledge base kb---a generalization of minimal inconsistency to arbitrary, possibly nonmonotonic, frameworks. We propose measures based on this notion and investigate their behavior in a nonmonotonic setting by revisiting existing rationality postulates, analyzing the compliance of the proposed measures with these postulates, and by investigating their computational complexity.

关键词

inconsistency handling; nonmonotonic reasoning

全文

PDF


Splitting an LPMLN Program

摘要

The technique called splitting sets has been proven useful in simplifying the investigation of Answer Set Programming (ASP). In this paper, we investigate the splitting set theorem for LPMLN that is a new extension of ASP created by combining the ideas of ASP and Markov Logic Networks (MLN). Firstly, we extend the notion of splitting sets to LPMLN programs and present the splitting set theorem for LPMLN. Then, the use of the theorem for simplifying several LPMLN inference tasks is illustrated. After that, we give two parallel approaches for solving LPMLN programs via using the theorem. The preliminary experimental results show that these approaches are alternative ways to promote an LPMLN solver.

关键词

logic programs

全文

PDF


Incorporating GAN for Negative Sampling in Knowledge Representation Learning

摘要

Knowledge representation learning aims at modeling knowledge graph by encoding entities and relations into a low dimensional space. Most of the traditional works for knowledge embedding need negative sampling to minimize a margin-based ranking loss. However, those works construct negative samples through a random mode, by which the samples are often too trivial to fit the model efficiently. In this paper, we propose a novel knowledge representation learning framework based on Generative Adversarial Networks (GAN). In this GAN-based framework, we take advantage of a generator to obtain high-quality negative samples. Meanwhile, the discriminator in GAN learns the embeddings of the entities and relations in knowledge graph. Thus, we can incorporate the proposed GAN-based framework into various traditional models to improve the ability of knowledge representation learning. Experimental results show that our proposed GAN-based framework outperforms baselines on triplets classification and link prediction tasks.

关键词

Knowledge graph embedding; Knowledge Representation

全文

PDF


Forgetting and Unfolding for Existential Rules

摘要

Existential rules, a family of expressive ontology languages, inherit desired expressive and reasoning properties from both description logics and logic programming. On the other hand, forgetting is a well studied operation for ontology reuse, obfuscation and analysis. Yet it is challenging to establish a theory of forgetting for existential rules. In this paper, we lay the foundation for a theory of forgetting for existential rules by developing a novel notion of unfolding. In particular, we introduce a definition of forgetting for existential rules in terms of query answering and provide a characterisation of forgetting by the unfolding. A result of forgetting may not be expressible in existential rules, and we then capture the expressibility of forgetting by a variant of boundedness. While the expressibility is undecidable in general, we identify a decidable fragment. Finally, we provide an algorithm for forgetting in this fragment.

关键词

Ontology; Rules; Forgetting

全文

PDF


Machine-Translated Knowledge Transfer for Commonsense Causal Reasoning

摘要

This paper studies the problem of multilingual causal reasoning in resource-poor languages. Existing approaches, translating into the most probable resource-rich language such as English, suffer in the presence of translation and language gaps between different cultural area, which leads to the loss of causality. To overcome these challenges, our goal is thus to identify key techniques to construct a new causality network of cause-effect terms, targeted for the machine-translated English, but without any language-specific knowledge of resource-poor languages. In our evaluations with three languages, Korean, Chinese, and French, our proposed method consistently outperforms all baselines, achieving up-to 69.0% reasoning accuracy, which is close to the state-of-the-art accuracy 70.2% achieved on English.

关键词

全文

PDF


Measuring Conditional Independence by Independent Residuals: Theoretical Results and Application in Causal Discovery

摘要

We investigate the relationship between conditional independence (CI) x ⊥ y|Z and the independence of two residuals x – E(x|Z) ⊥ –E(y|Z), where x and y are two random variables, and Z is a set of random variables. We show that if x, y and Z are generated by following linear structural equation model and all external influences follow Gaussian distributions, then x ⊥ y|Z if and only if x – E(x|Z) ⊥ y – E(y|Z). That is, the test of x ⊥ y|Z can be relaxed to a simpler unconditional independence test of x – E(x|Z) ⊥ y – E(y|Z). Furthermore, if all these external influences follow non-Gaussian distributions and the model satisfies structural faithfulness condition, then we have x ⊥ y|Z ⇔ x – E(x|Z) ⊥ y – E(y|Z). We apply the results above to the causal discovery problem, where the causal directions are generally determined by a set of V-structures and their consistent propagations, so CI test-based methods can return a set of Markov equivalence classes. We show that in linear non-Gaussian context, x – E(x|Z) ⊥ y – E(y|Z) ⇒ x – E(x|Z) ⊥ z or y – E(y|Z ⊥ z (∀z ∈ Z) if Z is a minimal d-separator, which implies z causes x (or y) if z directly connects to x (or y). Therefore, we conclude that CIs have useful information for distinguishing Markov equivalence classes. In summary, compared with the existing discretization-based and kernel-based CI testing methods, the proposed method provides a simpler way to measure CI, which needs only one unconditional independence test and two regression operations. When being applied to causal discovery, it can find more causal relationships, which is experimentally validated.

关键词

Causal inference; conditional independence testing; Regression

全文

PDF


Fairness in Decision-Making — The Causal Explanation Formula

摘要

AI plays an increasingly prominent role in society since decisions that were once made by humans are now delegated to automated systems. These systems are currently in charge of deciding bank loans, criminals' incarceration, and the hiring of new employees, and it's not difficult to envision that they will in the future underpin most of the decisions in society. Despite the high complexity entailed by this task, there is still not much understanding of basic properties of such systems. For instance, we currently cannot detect (neither explain nor correct) whether an AI system can be deemed fair (i.e., is abiding by the decision-constraints agreed by society) or it is reinforcing biases and perpetuating a preceding prejudicial practice. Issues of discrimination have been discussed extensively in political and legal circles, but there exists still not much understanding of the formal conditions that a system must meet to be deemed fair. In this paper, we use the language of structural causality (Pearl, 2000) to fill in this gap. We start by introducing three new fine-grained measures of transmission of change from stimulus to effect, which we called counterfactual direct (Ctf-DE), indirect (Ctf-IE), and spurious (Ctf-SE) effects. We then derive what we call the causal explanation formula, which allows the AI designer to quantitatively evaluate fairness and explain the total observed disparity of decisions through different discriminatory mechanisms. We apply these measures to various discrimination analysis tasks and run extensive simulations, including detection, evaluation, and optimization of decision-making under fairness constraints. We conclude studying the trade-off between different types of fairness criteria (outcome and procedural), and provide a quantitative approach to policy implementation and the design of fair AI systems.

关键词

Knowledge Representation and Reasoning: Action, Change, and Causality

全文

PDF


Embedding of Hierarchically Typed Knowledge Bases

摘要

Embedding has emerged as an important approach to prediction, inference, data mining and information retrieval based on knowledge bases and various embedding models have been presented. Most of these models are "typeless," namely, treating a knowledge base solely as a collection of instances without considering the types of the entities therein. In this paper, we investigate the use of entity type information for knowledge base embedding. We present a framework that augments a generic "typeless" embedding model to a typed one. The framework interprets an entity type as a constraint on the set of all entities and let these type constraints induce isomorphically a set of subsets in the embedding space. Additional cost functions are then introduced to model the fitness between these constraints and the embedding of entities and relations. A concrete example scheme of the framework is proposed. We demonstrate experimentally that this framework offers improved embedding performance over the typeless models and other typed models.

关键词

Knowledge Base Completion; Embedding; Entity Types

全文

PDF


On the Satisfiability Problem of Patterns in SPARQL 1.1

摘要

The pattern satisfiability is a fundamental problem for SPARQL. This paper provides a complete analysis of decidability/undecidability of satisfiability problems for SPARQL 1.1 patterns. A surprising result is the undecidability of satisfiability for SPARQL 1.1 patterns when only AND and MINUS are expressible. Also, it is shown that any fragment of SPARQL 1.1 without expressing both AND and MINUS is decidable. These results provide a guideline for future SPARQL query language design and implementation.

关键词

全文

PDF


Predicting Vehicular Travel Times by Modeling Heterogeneous Influences Between Arterial Roads

摘要

Predicting travel times of vehicles in urban settings is a useful and tangible quantity of interest in the context of intelligent transportation systems. We address the problem of travel time prediction in arterial roads using data sampled from probe vehicles. There is only a limited literature on methods using data input from probe vehicles. The spatio-temporal dependencies captured by existing data driven approaches are either too detailed or very simplistic. We strike a balance of the existing data driven approaches to account for varying degrees of influence a given road may experience from its neighbors, while controlling the number of parameters to be learnt. Specifically, we use a NoisyOR conditional probability distribution (CPD) in conjunction with a dynamic Bayesian network (DBN) to model state transitions of various roads. We propose an efficient algorithm to learn model parameters. We also propose an algorithm for predicting travel times on trips of arbitrary durations. Using synthetic and real world data traces we demonstrate the superior performance of the proposed method under different traffic conditions.

关键词

Travel time prediction; Dynamic Bayesian Networks; Expectation Maximization; Particle filtering

全文

PDF


Deep-Treat: Learning Optimal Personalized Treatments From Observational Data Using Neural Networks

摘要

We propose a novel approach for constructing effective treatment policies when the observed data is biased and lacks counterfactual information. Learning in settings where the observed data does not contain all possible outcomes for all treatments is difficult since the observed data is typically biased due to existing clinical guidelines. This is an important problem in the medical domain as collecting unbiased data is expensive and so learning from the wealth of existing biased data is a worthwhile task. Our approach separates the problem into two stages: first we reduce the bias by learning a representation map using a novel auto-encoder network---this allows us to control the trade-off between the bias-reduction and the information loss---and then we construct effective treatment policies on the transformed data using a novel feedforward network. Separation of the problem into these two stages creates an algorithm that can be adapted to the problem at hand---the bias-reduction step can be performed as a preprocessing step for other algorithms. We compare our algorithm against state-of-art algorithms on two semi-synthetic datasets and demonstrate that our algorithm achieves a significant improvement in performance.

关键词

全文

PDF


DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction

摘要

We train and validate a semi-supervised, multi-task LSTM on 57,675 person-weeks of data from off-the-shelf wearable heart rate sensors, showing high accuracy at detecting multiple medical conditions, including diabetes (0.8451), high cholesterol (0.7441), high blood pressure (0.8086), and sleep apnea (0.8298). We compare two semi-supervised training methods, semi-supervised sequence learning and heuristic pretraining, and show they outperform hand-engineered biomarkers from the medical literature. We believe our work suggests a new approach to patient risk stratification based on cardiovascular risk scores derived from popular wearables such as Fitbit, Apple Watch, or Android Wear.

关键词

Biomedical/Bioinformatics; Bio/Medicine; Semisupervised Learning;

全文

PDF


CSWA: Aggregation-Free Spatial-Temporal Community Sensing

摘要

In this paper, we present a novel community sensing paradigm CSWA –Community Sensing Without Sensor/Location Data Aggregation. CSWA is designed to obtain the environment information (e.g., air pollution or temperature) in each subarea of the target area, without aggregating sensor and location data collected by community members. CSWA operates on top of a secured peer-to-peer network over the community members and proposes a novel Decentralized Spatial-Temporal Compressive Sensing framework based on Parallelized Stochastic Gradient Descent. Through learning the low-rank structure via distributed optimization, CSWA approximates the value of the sensor data in each subarea (both covered and uncovered) for each sensing cycle using the sensor data locally stored in each member’s mobile device. Simulation experiments based on real-world datasets demonstrate that CSWA exhibits low approximation error (i.e., less than 0.2 centi-degree in city-wide temperature sensing task and 10 units of PM2.5 index in urban air pollution sensing) and performs comparably to (sometimes better than) state-of-the-art algorithms based on the data aggregation and centralized computation.

关键词

全文

PDF


Multi-Level Variational Autoencoder: Learning Disentangled Representations From Grouped Observations

摘要

We would like to learn a representation of the data that reflects the semantics behind a specific grouping of the data, where within a group the samples share a common factor of variation. For example, consider a set of face images grouped by identity. We wish to anchor the semantics of the grouping into a disentangled representation that we can exploit. However, existing deep probabilistic models often assume that the samples are independent and identically distributed, thereby disregard the grouping information. We present the Multi-Level Variational Autoencoder (ML-VAE), a new deep probabilistic model for learning a disentangled representation of grouped data. The ML-VAE separates the latent representation into semantically relevant parts by working both at the group level and the observation level, while retaining efficient test-time inference. We experimentally show that our model (i) learns a semantically meaningful disentanglement, (ii) enables control over the latent representation, and (iii) generalises to unseen groups.

关键词

Machine Learning Applications; Probabilistic Inference; Applications of Unsupervised Learning

全文

PDF


Dress Fashionably: Learn Fashion Collocation With Deep Mixed-Category Metric Learning

摘要

In this paper, we seek to enable machine to answer questions like, given a clutch bag, what kind of skirt, heel and even accessory best fashionably collocate with it ? This problem, dubbed fashion collocation, has almost been neglected by researchers due to the large uncertainty lies in fashion collocation and professional expertise required to address it. In this paper, we narrow down the well-collocated samples to be fashion images shared on fashion websites, with which we propose an end-to-end trainable deep mixed-category metric learning method to project well-collocated clothing items to lie close but items violating well-collocation far apart in the deep embedding space. Specifically, we simultaneously model the intra-category exclusiveness and cross-category inclusiveness of fashion collocation by feeding a set of well-collocated clothing items and corresponding bad-collocated clothing items to the deep neural network, further a hard-aware online exemplar mining strategy is designed to force the whole neural network to be trainable and learn discriminative features at the early and later training stages respectively. To motivate more research in fashion collocation, we collect a dataset of 0.2 million fashionably well-collocated images consisting of either on-body or off-body clothing items or accessories. Extensive experimental results show the feasibility and superiority of our method.

关键词

全文

PDF


Modeling Scientific Influence for Research Trending Topic Prediction

摘要

With the growing volume of publications in the Computer Science (CS) discipline, tracking the research evolution and predicting the future research trending topics are of great importance for researchers to keep up with the rapid progress of research. Within a research area, there are many top conferences that publish the latest research results. These conferences mutually influence each other and jointly promote the development of the research area. To predict the trending topics of mutually influenced conferences, we propose a correlated neural influence model, which has the ability to capture the sequential properties of research evolution in each individual conference and discover the dependencies among different conferences simultaneously. The experiments conducted on a scientific dataset including conferences in artificial intelligence and data mining show that our model consistently outperforms the other state-of-the-art methods. We also demonstrate the interpretability and predictability of the proposed model by providing its answers to two questions of concern, i.e., what the next rising trending topics are and for each conference who the most influential peer is.

关键词

Scientific influence; Research trend prediction

全文

PDF


Tap and Shoot Segmentation

摘要

We present a new segmentation method that leverages latent photographic information available at the moment of taking pictures. Photography on a portable device is often done by tapping to focus before shooting the picture. This tap-and-shoot interaction for photography not only specifies the region of interest but also yields useful focus/defocus cues for image segmentation. However, most of the previous interactive segmentation methods address the problem of image segmentation in a post-processing scenario without considering the action of taking pictures. We propose a learning-based approach to this new tap-and-shoot scenario of interactive segmentation. The experimental results on various datasets show that, by training a deep convolutional network to integrate the selection and focus/defocus cues, our method can achieve higher segmentation accuracy in comparison with existing interactive segmentation methods.

关键词

Interactive image segmentation

全文

PDF


HARP: Hierarchical Representation Learning for Networks

摘要

We present HARP, a novel method for learning low dimensional embeddings of a graph’s nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e. local minima) which can pose problems to non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations for learning representations in the original, detailed graph. We inductively extend this idea, by decomposing a graph in a series of levels, and then embed the hierarchy of graphs from the coarsest one to the original graph. HARP is a general meta-strategy to improve all of the state-of-the-art neural algorithms for embedding graphs, including DeepWalk, LINE, and Node2vec. Indeed, we demonstrate that applying HARP’s hierarchical paradigm yields improved implementations for all three of these methods, as evaluated on classification tasks on real-world graphs such as DBLP, BlogCatalog, and CiteSeer, where we achieve a performance gain over the original implementations by up to 14% Macro F1.

关键词

social networks; deep learning; latent representations; network classification

全文

PDF


Latent Sparse Modeling of Longitudinal Multi-Dimensional Data

摘要

We propose a tensor-based approach to analyze multi-dimensional data describing sample subjects. It simultaneously discovers patterns in features and reveals past temporal points that have impact on current outcomes. The model coefficient, a k-mode tensor, is decomposed into a summation of k tensors of the same dimension. To accomplish feature selection, we introduce the tensor '"atent LF,1 norm" as a grouped penalty in our formulation. Furthermore, the proposed model takes into account within-subject correlations by developing a tensor-based quadratic inference function. We provide an asymptotic analysis of our model when the sample size approaches to infinity. To solve the corresponding optimization problem, we develop a linearized block coordinate descent algorithm and prove its convergence for a fixed sample size. Computational results on synthetic datasets and real-file fMRI and EEG problems demonstrate the superior performance of the proposed approach over existing techniques.

关键词

quadratic inference function; tensor; longitudinal data; latent sparse

全文

PDF


Learning Datum-Wise Sampling Frequency for Energy-Efficient Human Activity Recognition

摘要

Continuous Human Activity Recognition (HAR) is an important application of smart mobile/wearable systems for providing dynamic assistance to users. However, HAR in real-time requires continuous sampling of data using built-in sensors (e.g., accelerometer), which significantly increases the energy cost and shortens the operating span. Reducing sampling rate can save energy but causes low recognition accuracy. Therefore, choosing adaptive sampling frequency that balances accuracy and energy efficiency becomes a critical problem in HAR. In this paper, we formalize the problem as minimizing both classification error and energy cost by choosing dynamically appropriate sampling rates. We propose Datum-Wise Frequency Selection (DWFS) to solve the problem via a continuous state Markov Decision Process (MDP). A policy function is learned from the MDP, which selects the best frequency for sampling an incoming data entity by exploiting a datum related state of the system. We propose a method for alternative learning the parameters of an activity classification model and the MDP that improves both the accuracy and the energy efficiency. We evaluate DWFS with three real-world HAR datasets, and the results show that DWFS statistically outperforms the state-of-the-arts regarding a combined measurement of accuracy and energy efficiency.

关键词

Human Activity Recognition; Markov Decision Process; Sampling Frequency

全文

PDF


A Neural Attention Model for Urban Air Quality Inference: Learning the Weights of Monitoring Stations

摘要

Urban air pollution has attracted much attention these years for its adverse impacts on human health. While monitoring stations have been established to collect pollutant statistics, the number of stations is very limited due to the high cost. Thus, inferring fine-grained urban air quality information is becoming an essential issue for both government and people. In this paper, we propose a generic neural approach, named ADAIN, for urban air quality inference. We leverage both the information from monitoring stations and urban data that are closely related to air quality, including POIs, road networks and meteorology. ADAIN combines feedforward and recurrent neural networks for modeling static and sequential features as well as capturing deep feature interactions effectively. A novel attempt of ADAIN is an attention-based pooling layer that automatically learns the weights of features from different monitoring stations, to boost the performance. We conduct experiments on a real-world air quality dataset and our approach achieves the highest performance compared with various state-of-the-art solutions.

关键词

air quality inference; deep neural networks; attention model

全文

PDF


Modeling Temporal Tonal Relations in Polyphonic Music Through Deep Networks With a Novel Image-Based Representation

摘要

We propose an end-to-end approach for modeling polyphonic music with a novel graphical representation, based on music theory, in a deep neural network. Despite the success of deep learning in various applications, it remains a challenge to incorporate existing domain knowledge in a network without affecting its training routines. In this paper we present a novel approach for predictive music modeling and music generation that incorporates domain knowledge in its representation. In this work, music is transformed into a 2D representation, inspired by tonnetz from music theory, which graphically encodes musical relationships between pitches. This representation is incorporated in a deep network structure consisting of multilayered convolutional neural networks (CNN, for learning an efficient abstract encoding of the representation) and recurrent neural networks with long short-term memory cells (LSTM, for capturing temporal dependencies in music sequences). We empirically evaluate the nature and the effectiveness of the network by using a dataset of classical music from various composers. We investigate the effect of parameters including the number of convolution feature maps, pooling strategies, and three configurations of the network: LSTM without CNN, LSTM with CNN (pre-trained vs. not pre-trained). Visualizations of the feature maps and filters in the CNN are explored, and a comparison is made between the proposed tonnetz-inspired representation and pianoroll, a commonly used representation of music in computational systems. Experimental results show that the tonnetz representation produces musical sequences that are more tonally stable and contain more repeated patterns than sequences generated by pianoroll-based models, a finding that is directly useful for tackling current challenges in music and AI such as smart music generation.

关键词

music; knowledge representation; CNN; LSTM; autoencoder

全文

PDF


Adversarial Network Embedding

摘要

Learning low-dimensional representations of networks has proved effective in a variety of tasks such as node classification, link prediction and network visualization. Existing methods can effectively encode different structural properties into the representations, such as neighborhood connectivity patterns, global structural role similarities and other high-order proximities. However, except for objectives to capture network structural properties, most of them suffer from lack of additional constraints for enhancing the robustness of representations. In this paper, we aim to exploit the strengths of generative adversarial networks in capturing latent features, and investigate its contribution in learning stable and robust graph representations. Specifically, we propose an Adversarial Network Embedding (ANE) framework, which leverages the adversarial learning principle to regularize the representation learning. It consists of two components, i.e., a structure preserving component and an adversarial learning component. The former component aims to capture network structural properties, while the latter contributes to learning robust representations by matching the posterior distribution of the latent representations to given priors. As shown by the empirical results, our method is competitive with or superior to state-of-the-art approaches on benchmark network embedding tasks.

关键词

Network Embedding; Adversarial Learning

全文

PDF


Collaborative Filtering With User-Item Co-Autoregressive Models

摘要

Deep neural networks have shown promise in collaborative filtering (CF). However, existing neural approaches are either user-based or item-based, which cannot leverage all the underlying information explicitly. We propose CF-UIcA, a neural co-autoregressive model for CF tasks, which exploits the structural correlation in the domains of both users and items. The co-autoregression allows extra desired properties to be incorporated for different tasks. Furthermore, we develop an efficient stochastic learning algorithm to handle large scale datasets. We evaluate CF-UIcA on two popular benchmarks: MovieLens 1M and Netflix, and achieve state-of-the-art performance in both rating prediction and top-N recommendation tasks, which demonstrates the effectiveness of CF-UIcA.

关键词

全文

PDF


The Shape of Art History in the Eyes of the Machine

摘要

How does the machine classify styles in art? And how does it relate to art historians' methods for analyzing style? Several studies showed the ability of the machine to learn and predict styles, such as Renaissance, Baroque, Impressionism, etc., from images of paintings. This implies that the machine can learn an internal representation encoding discriminative features through its visual analysis. However, such a representation is not necessarily interpretable. We conducted a comprehensive study of several of the state-of-the-art convolutional neural networks applied to the task of style classification on 67K images of paintings, and analyzed the learned representation through correlation analysis with concepts derived from art history. Surprisingly, the networks could place the works of art in a smooth temporal arrangement mainly based on learning style labels, without any a priori knowledge of time of creation, the historical time and context of styles, or relations between styles. The learned representations showed that there are a few underlying factors that explain the visual variations of style in art. Some of these factors were found to correlate with style patterns suggested by Heinrich Wölfflin (1846-1945). The learned representations also consistently highlighted certain artists as the extreme distinctive representative of their styles, which quantitatively confirms art historian observations.

关键词

computational art history, applications of AI, Art and AI

全文

PDF


Multi-Step Time Series Generator for Molecular Dynamics

摘要

Molecular dynamics (MD) is a powerful computational method for simulating molecular behavior. Deep neural networks provide a novel method of generating MD data efficiently, but there is no architecture that mitigates the well-known exposure bias accumulated by multi-step generations. In this paper, we propose a multi-step time series generator using a deep neural network based on Wasserstein generative adversarial nets. Instead of sparse real data, our model evolves a latent variable z that is densely distributed in a low-dimensional space. This novel framework successfully mitigates the exposure bias. Moreover, our model can evolve part of the system (Feature extraction) with any time step (Step skip), which accelerates the efficient generation of MD data. The applicability of this model is evaluated through three different systems: harmonic oscillator, bulk water, and polymer melts. The experimental results demonstrate that our model can generate time series of the MD data with sufficient accuracy to calculate the physical and important dynamical statistics.

关键词

Molecular dynamics; Deep Learning; Generative Adversarial Nets

全文

PDF


Search Action Sequence Modeling With Long Short-Term Memory for Search Task Success Evaluation

摘要

Search task success rate is a crucial metric based on the search experience of users to measure the performance of search systems. Modeling search action sequence would help to capture the latent search patterns of users in successful and unsuccessful search tasks. Existing approaches use aggregated features to describe the user behavior in search action sequences, which depend on heuristic hand-crafted feature design and ignore a lot of information inherent in the user behavior. In this paper, we employ Long Short-Term Memory (LSTM) that performs end-to-end fine-tuning during the training to learn search action sequence representation for search task success evaluation. Concretely, we normalize the search action sequences by introducing a dummy idle action, which guarantees that the time intervals between contiguous actions are fixed. Simultaneously, we propose a novel data augmentation strategy to increase the pattern variations on search action sequence data to improve the generalization ability of LSTM. We evaluate the proposed approach on open datasets with two different definitions of search task success. The experimental results show that the proposed approach achieves significant performance improvement compared with several excellent search task success evaluation approaches.

关键词

全文

PDF


Discriminant Projection Representation-Based Classification for Vision Recognition

摘要

Representation-based classification methods such as sparse representation-based classification (SRC) and linear regression classification (LRC) have attracted a lot of attentions. In order to obtain the better representation, a novel method called projection representation-based classification (PRC) is proposed for image recognition in this paper. PRC is based on a new mathematical model. This model denotes that the "ideal projection" of a sample point x on the hyper-space H may be gained by iteratively computing the projection of x on a line of hyper-space H with the proper strategy. Therefore, PRC is able to iteratively approximate the "ideal representation" of each subject for classification. Moreover, the discriminant PRC (DPRC) is further proposed, which obtains the discriminant information by maximizing the ratio of the between-class reconstruction error over the within-class reconstruction error. Experimental results on five typical databases show that the proposed PRC and DPRC are effective and outperform other state-of-the-art methods on several vision recognition tasks.

关键词

Sparse Representation; Linear Regression;Image Classification;Object Recognition

全文

PDF


The Geometric Block Model

摘要

To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model generalizes the random geometric graphs in the same way that the well-studied stochastic block model generalizes the Erdös-Renyi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancement in community detection. While being a topic of fundamental theoretical interest, our main contribution is to show that many practical community structures are better explained by the geometric block model. We also show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. Indeed, even in the regime where the average degree of the graph grows only logarithmically with the number of vertices (sparse-graph), we show that this algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm.

关键词

全文

PDF


Group-Pair Convolutional Neural Networks for Multi-View Based 3D Object Retrieval

摘要

In recent years, research interest in object retrieval has shifted from 2D towards 3D data. Despite many well-designed approaches, we point out that limitations still exist and there is tremendous room for improvement, including the heavy reliance on hand-crafted features, the separated optimization of feature extraction and object retrieval, and the lack of sufficient training samples. In this work, we address the above limitations for 3D object retrieval by developing a novel end-to-end solution named Group Pair Convolutional Neural Network (GPCNN). It can jointly learn the visual features from multiple views of a 3D model and optimize towards the object retrieval task. To tackle the insufficient training data issue, we innovatively employ a pair-wise learning scheme, which learns model parameters from the similarity of each sample pair, rather than the traditional way of learning from sparse label–sample matching. Extensive experiments on three public benchmarks show that our GPCNN solution significantly outperforms the state-of-the-art methods with 3% to 42% improvement in retrieval accuracy.

关键词

3D object retrieval; Group-pair CNN;

全文

PDF


Dependence Guided Unsupervised Feature Selection

摘要

In the past decade, various sparse learning based unsupervised feature selection methods have been developed. However, most existing studies adopt a two-step strategy, i.e., selecting the top-m features according to a calculated descending order and then performing K-means clustering, resulting in a group of sub-optimal features. To address this problem, we propose a Dependence Guided Unsupervised Feature Selection (DGUFS) method to select features and partition data in a joint manner. Our proposed method enhances the inter-dependence among original data, cluster labels, and selected features. In particular, a projection-free feature selection model is proposed based on l20-norm equality constraints. We utilize the learned cluster labels to fill in the information gap between original data and selected features. Two dependence guided terms are consequently proposed for our model. More specifically, one term increases the dependence of desired cluster labels on original data, while the other term maximizes the dependence of selected features on cluster labels to guide the process of feature selection. Last but not least, an iterative algorithm based on Alternating Direction Method of Multipliers (ADMM) is designed to solve the constrained minimization problem efficiently. Extensive experiments on different datasets consistently demonstrate that our proposed method significantly outperforms state-of-the-art baselines.

关键词

全文

PDF


On Trivial Solution and High Correlation Problems in Deep Supervised Hashing

摘要

Deep supervised hashing (DSH), which combines binary learning and convolutional neural network, has attracted considerable research interests and achieved promising performance for highly efficient image retrieval. In this paper, we show that the widely used loss functions, pair-wise loss and triplet loss, suffer from the trivial solution problem and usually lead to highly correlated bits in practice, limiting the performance of DSH. One important reason is that it is difficult to incorporate proper constraints into the loss functions under the mini-batch based optimization algorithm. To tackle these problems, we propose to adopt ensemble learning strategy for deep model training. We found out that this simple strategy is capable of effectively decorrelating different bits, making the hashcodes more informative. Moreover, it is very easy to parallelize the training and support incremental model learning, which are very useful for real-world applications but usually ignored by existing DSH approaches. Experiments on benchmarks demonstrate the proposed ensemble based DSH can improve the performance of DSH approaches significant.

关键词

Hashing;Deep Learning; Neural Network

全文

PDF


Learning User Preferences to Incentivize Exploration in the Sharing Economy

摘要

We study platforms in the sharing economy and discuss the need for incentivizing users to explore options that otherwise would not be chosen. For instance, rental platforms such as Airbnb typically rely on customer reviews to provide users with relevant information about different options. Yet, often a large fraction of options does not have any reviews available. Such options are frequently neglected as viable choices, and in turn are unlikely to be evaluated, creating a vicious cycle. Platforms can engage users to deviate from their preferred choice by offering monetary incentives for choosing a different option instead. To efficiently learn the optimal incentives to offer, we consider structural information in user preferences and introduce a novel algorithm---Coordinated Online Learning (CoOL)---for learning with structural information modeled as convex constraints. We provide formal guarantees on the performance of our algorithm and test the viability of our approach in a user study with data of apartments on Airbnb. Our findings suggest that our approach is well-suited to learn appropriate incentives and increase exploration on the investigated platform.

关键词

Online Learning; Sharing Economy; Incentives; Exploration

全文

PDF


Video-Based Sign Language Recognition Without Temporal Segmentation

摘要

Millions of hearing impaired people around the world routinely use some variants of sign languages to communicate, thus the automatic translation of a sign language is meaningful and important. Currently, there are two sub-problems in Sign Language Recognition (SLR), i.e., isolated SLR that recognizes word by word and continuous SLR that translates entire sentences. Existing continuous SLR methods typically utilize isolated SLRs as building blocks, with an extra layer of preprocessing (temporal segmentation) and another layer of post-processing (sentence synthesis). Unfortunately, temporal segmentation itself is non-trivial and inevitably propagates errors into subsequent steps. Worse still, isolated SLR methods typically require strenuous labeling of each word separately in a sentence, severely limiting the amount of attainable training data. To address these challenges, we propose a novel continuous sign recognition framework, the Hierarchical Attention Network with Latent Space (LS-HAN), which eliminates the preprocessing of temporal segmentation. The proposed LS-HAN consists of three components: a two-stream Convolutional Neural Network (CNN) for video feature representation generation, a Latent Space (LS) for semantic gap bridging, and a Hierarchical Attention Network (HAN) for latent space based recognition. Experiments are carried out on two large scale datasets. Experimental results demonstrate the effectiveness of the proposed framework.

关键词

Sign Language Recognition

全文

PDF


Energy-Efficient Automatic Train Driving by Learning Driving Patterns

摘要

Railway is regarded as the most sustainable means of modern transportation. With the fast-growing of fleet size and the railway mileage, the energy consumption of trains is becoming a serious concern globally. The nature of railway offers a unique opportunity to optimize the energy efficiency of locomotives by taking advantage of the undulating terrains along a route. The derivation of an energy-optimal train driving solution, however, proves to be a significant challenge due to the high dimension, nonlinearity, complex constraints, and time-varying characteristic of the problem. An optimized solution can only be attained by considering both the complex environmental conditions of a given route and the inherent characteristics of a locomotive. To tackle the problem, this paper employs a high-order correlation learning method for online generation of the energy optimized train driving solutions. Based on the driving data of experienced human drivers, a hypergraph model is used to learn the optimal embedding from the specified features for the decision of a driving operation. First, we design a feature set capturing the driving status. Next all the training data are formulated as a hypergraph and an inductive learning process is conducted to obtain the embedding matrix. The hypergraph model can be used for real-time generation of driving operation. We also proposed a reinforcement updating scheme, which offers the capability of sustainable enhancement on the hypergraph model in industrial applications. The learned model can be used to determine an optimized driving operation in real-time tested on the Hardware-in-Loop platform. Validation experiments proved that the energy consumption of the proposed solution is around 10% lower than that of average human drivers.

关键词

Automatic Train Driving; Hypergraph Learning

全文

PDF


Video-Based Person Re-Identification via Self Paced Weighting

摘要

Person re-identification (re-id) is a fundamental technique to associate various person images, captured by differentsurveillance cameras, to the same person. Compared to the single image based person re-id methods, video-based personre-id has attracted widespread attentions because extra space-time information and more appearance cues that can beused to greatly improve the matching performance. However, most existing video-based person re-id methods equally treatall video frames, ignoring their quality discrepancy caused by object occlusion and motions, which is a common phenomenonin real surveillance scenario. Based on this finding, we propose a novel video-based person re-id method via self paced weighting (SPW). Firstly, we propose a self paced outlier detection method to evaluate the noise degree of video sub sequences. Thereafter, a weighted multi-pair distance metric learning approach is adopted to measure the distance of two person image sequences. Experimental results on two public datasets demonstrate the superiority of the proposed method over current state-of-the-art work.

关键词

Video-based Person Re-identification;Self-Paced Learning;

全文

PDF


Generating Music Medleys via Playing Music Puzzle Games

摘要

Generating music medleys is about finding an optimal permutation of a given set of music clips. Toward this goal, we propose a self-supervised learning task, called the music puzzle game, to train neural network models to learn the sequential patterns in music. In essence, such a game requires machines to correctly sort a few multisecond music fragments. In the training stage, we learn the model by sampling multiple non-overlapping fragment pairs from the same songs and seeking to predict whether a given pair is consecutive and is in the correct chronological order. For testing, we design a number of puzzle games with different difficulty levels, the most difficult one being music medley, which requiring sorting fragments from different songs. On the basis of state-of-the-art Siamese convolutional network, we propose an improved architecture that learns to embed frame-level similarity scores computed from the input fragment pairs to a common space, where fragment pairs in the correct order can be more easily identified. Our result shows that the resulting model, dubbed as the similarity embedding network (SEN), performs better than competing models across different games, including music jigsaw puzzle, music sequencing, and music medley. Example results can be found at our project website, https://remyhuang.github.io/DJnet.

关键词

Music Medley, Music Puzzle, Siamese Network, self-supervised learning

全文

PDF


Link Prediction With Personalized Social Influence

摘要

Link prediction in social networks is to infer the new links likely to be formed next or to reconstruct the links that are currently missing. Other than the pure topological network structures, social networks are often associated with rich information of social activities of users, such as tweeting, retweeting, and replying. Social theories such as social influence indicate that social activities could have potential impacts on the neighbors, and links in social media could be the results of the social influence among users. It motivates us to learn and model social influence among users to tackle the link prediction problem. However, this is a non-trivial task since it is challenging to model heterogeneous social activities. Traditional methods often define universal metrics of social influence for all users, but even for the same activity of a user, the influence towards different neighbors might not be the same. It motivates a personalized learning schema. In information theory, if a time-series signal influences another, then the uncertainty in the latter one will be reduced, given the distribution of the former one. Thus, we are motivated to learn social influence based on the timestamps of social activities. Given the timestamps of each user, we use entropy to measure the reduction of uncertainty of his/her neighbors. The learned social influence is then incorporated into a graph based link prediction model to perform joint learning. Through comprehensive experiments, we demonstrate that the proposed framework can perform better than the state-of-the-art methods on different real-world networks.

关键词

Link Prediction; Social Networks; Social Influence

全文

PDF


Task-Aware Compressed Sensing With Generative Adversarial Networks

摘要

In recent years, neural network approaches have been widely adopted for machine learning tasks, with applications in computer vision. More recently, unsupervised generative models based on neural networks have been successfully applied to model data distributions via low-dimensional latent spaces. In this paper, we use Generative Adversarial Networks (GANs) to impose structure in compressed sensing problems, replacing the usual sparsity constraint. We propose to train the GANs in a task-aware fashion, specifically for reconstruction tasks. We also show that it is possible to train our model without using any (or much) non-compressed data. Finally, we show that the latent space of the GAN carries discriminative information and can further be regularized to generate input features for general inference tasks. We demonstrate the effectiveness of our method on a variety of reconstruction and classification problems.

关键词

compressive; sensing; compressed; generative; GAN; task-aware;

全文

PDF


Context-Aware Symptom Checking for Disease Diagnosis Using Hierarchical Reinforcement Learning

摘要

Online symptom checkers have been deployed by sites such as WebMD and Mayo Clinic to identify possible causes and treatments for diseases based on a patient’s symptoms. Symptom checking first assesses a patient by asking a series of questions about their symptoms, then attempts to predict potential diseases. The two design goals of a symptom checker are to achieve high accuracy and intuitive interactions. In this paper we present our context-aware hierarchical reinforcement learning scheme, which significantly improves accuracy of symptom checking over traditional systems while also making a limited number of inquiries.

关键词

全文

PDF


DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks

摘要

Survival analysis (time-to-event analysis) is widely used in economics and finance, engineering, medicine and many other areas. A fundamental problem is to understand the relationship between the covariates and the (distribution of) survival times(times-to-event). Much of the previous work has approached the problem by viewing the survival time as the first hitting time of a stochastic process, assuming a specific form for the underlying stochastic process, using available data to learn the relationship between the covariates and the parameters of the model, and then deducing the relationship between covariates and the distribution of first hitting times (the risk). However, previous models rely on strong parametric assumptions that are often violated. This paper proposes a very different approach to survival analysis, DeepHit, that uses a deep neural network to learn the distribution of survival times directly.DeepHit makes no assumptions about the underlying stochastic process and allows for the possibility that the relationship between covariates and risk(s) changes over time. Most importantly, DeepHit smoothly handles competing risks; i.e. settings in which there is more than one possible event of interest.Comparisons with previous models on the basis of real and synthetic datasets demonstrate that DeepHit achieves large and statistically significant performance improvements over previous state-of-the-art methods.

关键词

survival analysis; competing risks; neural networks; first-hitting-time analysis

全文

PDF


DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices

摘要

Deploying deep neural networks on mobile devices is a challenging task. Current model compression methods such as matrix decomposition effectively reduce the deployed model size, but still cannot satisfy real-time processing requirement. This paper first discovers that the major obstacle is the excessive execution time of non-tensor layers such as pooling and normalization without tensor-like trainable parameters. This motivates us to design a novel acceleration framework: DeepRebirth through "slimming" existing consecutive and parallel non-tensor and tensor layers. The layer slimming is executed at different substructures: (a) streamline slimming by merging the consecutive non-tensor and tensor layer vertically; (b) branch slimming by merging non-tensor and tensor branches horizontally. The proposed optimization operations significantly accelerate the model execution and also greatly reduce the run-time memory cost since the slimmed model architecture contains less hidden layers. To maximally avoid accuracy loss, the parameters in new generated layers are learned with layer-wise fine-tuning based on both theoretical analysis and empirical verification. As observed in the experiment, DeepRebirth achieves more than 3x speed-up and 2.5x run-time memory saving on GoogLeNet with only 0.4% drop on top-5 accuracy in ImageNet. Furthermore, by combining with other model compression techniques, DeepRebirth offers an average of 106.3ms inference time on the CPU of Samsung Galaxy S5 with 86.5% top-5 accuracy, 14% faster than SqueezeNet which only has a top-5 accuracy of 80.5%.

关键词

deep model acceleration; model compression; mobile deep learning

全文

PDF


Discriminative Semi-Coupled Projective Dictionary Learning for Low-Resolution Person Re-Identification

摘要

Person re-identification (re-ID) is a fundamental task in automated video surveillance. In real-world visual surveillance systems, a person is often captured in quite low resolutions. So we often need to perform low-resolution person re-ID, where images captured by different cameras have great resolution divergences. Existing methods cope problem via some complicated and time-consuming strategies, making them less favorable in practice, and their performances are far from satisfactory. In this paper, we design a novel Discriminative Semi-coupled Projective Dictionary Learning (DSPDL) model to effectively and efficiently solve this problem. Specifically, we propose to jointly learn a pair of dictionaries and a mapping to bridge the gap across low(er) and high(er) resolution person images. Besides, we develop a novel graph regularizer to incorporate positive and negative image pair information in a parameterless fashion. Meanwhile, we adopt the efficient and powerful projective dictionary learning technique to boost the our efficiency. Experiments on three public datasets show the superiority of the proposed method to the state-of-the-art ones.

关键词

全文

PDF


Unified Locally Linear Classifiers With Diversity-Promoting Anchor Points

摘要

Locally Linear Support Vector Machine (LLSVM) has been actively used in classification tasks due to its capability of classifying nonlinear patterns. However, existing LLSVM suffers from two drawbacks: (1) a particular and appropriate regularization for LLSVM has not yet been addressed; (2) it usually adopts a three-stage learning scheme composed of learning anchor points by clustering, learning local coding coordinates by a predefined coding scheme, and finally learning for training classifiers. We argue that this decoupled approaches oversimplifies the original optimization problem, resulting in a large deviation due to the disparate purpose of each step. To address the first issue, we propose a novel diversified regularization which could capture infrequent patterns and reduce the model size without sacrificing the representation power. Based on this regularization, we develop a joint optimization algorithm among anchor points, local coding coordinates and classifiers to simultaneously minimize the overall classification risk, which is termed as Diversified and Unified Locally Linear Support Vector Machine (DU-LLSVM for short). To the best of our knowledge, DU-LLSVM is the first principled method that directly learns sparse local coding and can be easily generalized to other supervised learning models. Extensive experiments showed that DU-LLSVM consistently surpassed several state-of-the-art methods with a predefined local coding scheme (e.g. LLSVM) or a supervised anchor point learning (e.g. SAPL-LLSVM).

关键词

Classification; Manifold Learning; Support Vector Machine

全文

PDF


Multi-Modal Multi-Task Learning for Automatic Dietary Assessment

摘要

We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, our task is to automatically rate the meals and deliver advisory comments for improving users' diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, an end-to-end neural model is proposed. In particular, comprehensive meal representations are obtained from images, descriptions and user information. We further introduce a novel memory network architecture to store meal representations and reason over the meal representations to support predictions. Results on a real-world dataset show that our method outperforms two strong image captioning baselines significantly.

关键词

Dietary Assessment, Multi-modal Learning, Memory Network

全文

PDF


Distance-Aware DAG Embedding for Proximity Search on Heterogeneous Graphs

摘要

Proximity search on heterogeneous graphs aims to measure the proximity between two nodes on a graph w.r.t. some semantic relation for ranking. Pioneer work often tries to measure such proximity by paths connecting the two nodes. However, paths as linear sequences have limited expressiveness for the complex network connections. In this paper, we explore a more expressive DAG (directed acyclic graph) data structure for modeling the connections between two nodes. Particularly, we are interested in learning a representation for the DAGs to encode the proximity between two nodes. We face two challenges to use DAGs, including how to efficiently generate DAGs and how to effectively learn DAG embedding for proximity search. We find distance-awareness as important for proximity search and the key to solve the above challenges. Thus we develop a novel Distance-aware DAG Embedding (D2AGE) model. We evaluate D2AGE on three benchmark data sets with six semantic relations, and we show that D2AGE outperforms the state-of-the-art baselines. We release the code on https://github.com/shuaiOKshuai.

关键词

semantic proximity search; heterogeneous graph; DAG embedding

全文

PDF


Semi-Supervised Biomedical Translation With Cycle Wasserstein Regression GANs

摘要

The biomedical field offers many learning tasks that share unique challenges: large amounts of unpaired data, and a high cost to generate labels. In this work, we develop a method to address these issues with semi-supervised learning in regression tasks (e.g., translation from source to target). Our model uses adversarial signals to learn from unpaired datapoints, and imposes a cycle-loss reconstruction error penalty to regularize mappings in either direction against one another. We first evaluate our method on synthetic experiments, demonstrating two primary advantages of the system: 1) distribution matching via the adversarial loss and 2) regularization towards invertible mappings via the cycle loss. We then show a regularization effect and improved performance when paired data is supplemented by additional unpaired data on two real biomedical regression tasks: estimating the physiological effect of medical treatments, and extrapolating gene expression (transcriptomics) signals. Our proposed technique is a promising initial step towards more robust use of adversarial signals in semi-supervised regression, and could be useful for other tasks (e.g., causal inference or modality translation) in the biomedical field.

关键词

clinical informatics, biomedical informatics, semi-supervised learning, generative adversarial networks, GANs

全文

PDF


Probabilistic Ensemble of Collaborative Filters

摘要

Collaborative filtering is an important technique for recommendation. Whereas it has been repeatedly shown to be effective in previous work,its performance remains unsatisfactory in many real-world applications, especially those where the items or users are highly diverse. In this paper, we explore an ensemble-based framework to enhance thecapability of a recommender in handling diverse data. Specifically, we formulate a probabilistic model which integrates the items, the users, as well as the associations between them into a generative process. On top of this formulation, we further derive a progressive algorithm to construct an ensemble of collaborative filters. In each iteration, a new filter is derived from re-weighted entries and incorporated into the ensemble. It is noteworthy that while the algorithmic procedure of our algorithm is apparently similar to boosting, it is derived from an essentially different formulation and thus differs in several key technical aspects. We tested the proposed method on three large datasets, and observed substantial improvement over the state of the art, including L2Boost, an effective method based on boosting.

关键词

Recommendation Systems; Ensemble

全文

PDF


A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns

摘要

Pay-per-click advertising includes various formats (e.g., search, contextual, and social) with a total investment of more than 140 billion USD per year. An advertising campaign is composed of some subcampaigns-each with a different ad-and a cumulative daily budget. The allocation of the ads is ruled exploiting auction mechanisms. In this paper, we propose, for the first time to the best of our knowledge, an algorithm for the online joint bid/budget optimization of pay-per-click multi-channel advertising campaigns. We formulate the optimization problem as a combinatorial bandit problem, in which we use Gaussian Processes to estimate stochastic functions, Bayesian bandit techniques to address the exploration/exploitation problem, and a dynamic programming technique to solve a variation of the Multiple-Choice Knapsack problem. We experimentally evaluate our algorithm both in simulation-using a synthetic setting generated from real data from Yahoo!-and in a real-world application over an advertising period of two months.

关键词

Internet Advertising; Combinatorial MAB; Gaussian Process

全文

PDF


Hierarchical Video Generation From Orthogonal Information: Optical Flow and Texture

摘要

Learning to represent and generate videos from unlabeled data is a very challenging problem. To generate realistic videos, it is important not only to ensure that the appearance of each frame is real, but also to ensure the plausibility of a video motion and consistency of a video appearance in the time direction. The process of video generation should be divided according to these intrinsic difficulties. In this study, we focus on the motion and appearance information as two important orthogonal components of a video, and propose Flow-and-Texture-Generative Adversarial Networks (FTGAN) consisting of FlowGAN and TextureGAN. In order to avoid a huge annotation cost, we have to explore a way to learn from unlabeled data. Thus, we employ optical flow as motion information to generate videos. FlowGAN generates optical flow, which contains only the edge and motion of the videos to be begerated. On the other hand, TextureGAN specializes in giving a texture to optical flow generated by FlowGAN. This hierarchical approach brings more realistic videos with plausible motion and appearance consistency. Our experiments show that our model generates more plausible motion videos and also achieves significantly improved performance for unsupervised action classification in comparison to previous GAN works. In addition, because our model generates videos from two independent information, our model can generate new combinations of motion and attribute that are not seen in training data, such as a video in which a person is doing sit-up in a baseball ground.

关键词

Deep Learning;Neural Networks;Videos

全文

PDF


Sequence-to-Sequence Learning via Shared Latent Representation

摘要

Sequence-to-sequence learning is a popular research area in deep learning, such as video captioning and speech recognition. Existing methods model this learning as a mapping process by first encoding the input sequence to a fixed-sized vector, followed by decoding the target sequence from the vector. Although simple and intuitive, such mapping model is task-specific, unable to be directly used for different tasks. In this paper, we propose a star-like framework for general and flexible sequence-to-sequence learning, where different types of media contents (the peripheral nodes) could be encoded to and decoded from a shared latent representation (SLR) (the central node). This is inspired by the fact that human brain could learn and express an abstract concept in different ways. The media-invariant property of SLR could be seen as a high-level regularization on the intermediate vector, enforcing it to not only capture the latent representation intra each individual media like the auto-encoders, but also their transitions like the mapping models. Moreover, the SLR model is content-specific, which means it only needs to be trained once for a dataset, while used for different tasks. We show how to train a SLR model via dropout and use it for different sequence-to-sequence tasks. Our SLR model is validated on the Youtube2Text and MSR-VTT datasets, achieving superior performance on video-to-sentence task, and the first sentence-to-video results.

关键词

全文

PDF


Compatibility Family Learning for Item Recommendation and Generation

摘要

Compatibility between items, such as clothes and shoes, is a major factor among customer's purchasing decisions. However, learning "compatibility" is challenging due to (1) broader notions of compatibility than those of similarity, (2) the asymmetric nature of compatibility, and (3) only a small set of compatible and incompatible items are observed. We propose an end-to-end trainable system to embed each item into a latent vector and project a query item into K compatible prototypes in the same space. These prototypes reflect the broad notions of compatibility. We refer to both the embedding and prototypes as "Compatibility Family." In our learned space, we introduce a novel Projected Compatibility Distance (PCD) function which is differentiable and ensures diversity by aiming for at least one prototype to be close to a compatible item, whereas none of the prototypes are close to an incompatible item. We evaluate our system on a toy dataset, two Amazon product datasets, and Polyvore outfit dataset. Our method consistently achieves state-of-the-art performance. Finally, we show that we can visualize the candidate compatible prototypes using a Metric-regularized Conditional Generative Adversarial Network (MrCGAN), where the input is a projected prototype and the output is a generated image of a compatible item. We ask human evaluators to judge the relative compatibility between our generated images and images generated by CGANs conditioned directly on query items. Our generated images are significantly preferred, with roughly twice the number of votes as others.

关键词

Fashion; Compatibility Learning; Generative Adversarial Networks

全文

PDF


Neural Ideal Point Estimation Network

摘要

Understanding politics is challenging because the politics take the influence from everything. Even we limit ourselves to the political context in the legislative processes; we need a better understanding of latent factors, such as legislators, bills, their ideal points, and their relations. From the modeling perspective, this is difficult 1) because these observations lie in a high dimension that requires learning on low dimensional representations, and 2) because these observations require complex probabilistic modeling with latent variables to reflect the causalities. This paper presents a new model to reflect and understand this political setting, NIPEN, including factors mentioned above in the legislation. We propose two versions of NIPEN: one is a hybrid model of deep learning and probabilistic graphical model, and the other model is a neural tensor model. Our result indicates that NIPEN successfully learns the manifold of the legislative bill's text, and NIPEN utilizes the learned low-dimensional latent variables to increase the prediction performance of legislators' votings. Additionally, by virtue of being a domain-rich probabilistic model, NIPEN shows the hidden strength of the legislators' trust network and their various characteristics on casting votes.

关键词

Ideal point estimation; Legislative voting; Variational autoencoder

全文

PDF


Nonlocal Patch Based t-SVD for Image Inpainting: Algorithm and Error Analysis

摘要

In this paper, we propose a novel image inpainting framework consisting of an interpolation step and a low-rank tensor completion step. More specifically, we first initial the image with triangulation-based linear interpolation, and then we find similar patches for each missing-entry centered patch. Treating a group of patch matrices as a tensor, we employ the recently proposed effective t-SVD tensor completion algorithm with a warm start strategy to inpaint it. We observe that the interpolation step is such a rough initialization that the similar patch we found may not exactly match with the reference, so we name the problem as Patch Mismatch and analyse the error caused by it thoroughly. Our theoretical analysis shows that the error caused by Patch Mismatch can be decomposed into two components, one of which can be bounded by a reasonable assumption named local patch similarity, and another part is lower than that using matrix. Experiments on real images verify our method's superiority to the state-of-the-art inpainting methods.

关键词

全文

PDF


r-BTN: Cross-Domain Face Composite and Synthesis From Limited Facial Patches

摘要

Recent face composite and synthesis related works have shown promising results in generating realistic face images from deep convolutional networks. However, these works either do not generate consistent results when the constituent patches contain large domain variations (i.e., from face and sketch domains) or cannot generate high-resolution images with limited facial patches (e.g., the inpainting approach tends to blur the generated region when the missing area is more than 50%). Motivated by the mental imagery and simulation in human cognition, we exploit the potential of deep learning networks in filling large missing region (e.g., as high as 95% missing) and generating realistic faces with high fidelity in cross domains.We propose the recursive generation by bidirectional transformation networks (r-BTN) that recursively generates a whole face/sketch from a small sketch/face patch. The large missing area and domain variations make it difficult to generate satisfactory results using a unidirectional cross-domain learning structure. We explore that the bidirectional transformation network can lead to the consistent result by minimizing the forward and backward errors in the cross-domain scenario. On the other hand, a forward and backward bidirectional learning between the face and sketch domains would enable recursive estimation of the missing region in an incremental manner to yield appealing results. r-BTN also adopts an adversarial constraint to encourage the generation of realistic faces/sketches. Extensive experiments have been conducted to demonstrate the superior performance from r-BTN as compared to existing potential solutions.

关键词

Generative Adversarial Network; Image Inpainting; Cross Domain; Image Transformation

全文

PDF


Exercise-Enhanced Sequential Modeling for Student Performance Prediction

摘要

In online education systems, for offering proactive services to students (e.g., personalized exercise recommendation), a crucial demand is to predict student performance (e.g., scores) on future exercising activities. Existing prediction methods mainly exploit the historical exercising records of students, where each exercise is usually represented as the manually labeled knowledge concepts, and the richer information contained in the text description of exercises is still underexplored. In this paper, we propose a novel Exercise-Enhanced Recurrent Neural Network (EERNN) framework for student performance prediction by taking full advantage of both student exercising records and the text of each exercise. Specifically, for modeling the student exercising process, we first design a bidirectional LSTM to learn each exercise representation from its text description without any expertise and information loss. Then, we propose a new LSTM architecture to trace student states (i.e., knowledge states) in their sequential exercising process with the combination of exercise representations. For making final predictions, we design two strategies under EERNN, i.e., EERNNM with Markov property and EERNNA with Attention mechanism. Extensive experiments on large-scale real-world data clearly demonstrate the effectiveness of EERNN framework. Moreover, by incorporating the exercise correlations, EERNN can well deal with the cold start problems from both student and exercise perspectives.

关键词

Student Performance Prediction; Exercise Text; Recurrent Neural Network

全文

PDF


Compressed Sensing MRI Using a Recursive Dilated Network

摘要

Compressed sensing magnetic resonance imaging (CS-MRI) is an active research topic in the field of inverse problems. Conventional CS-MRI algorithms usually exploit the sparse nature of MRI in an iterative manner. These optimization-based CS-MRI methods are often time-consuming at test time, and are based on fixed transform bases or shallow dictionaries, which limits modeling capacity. Recently, deep models have been introduced to the CS-MRI problem. One main challenge for CS-MRI methods based on deep learning is the trade off between model performance and network size. We propose a recursive dilated network (RDN) for CS-MRI that achieves good performance while reducing the number of network parameters. We adopt dilated convolutions in each recursive block to aggregate multi-scale information within the MRI. We also adopt a modified shortcut strategy to help features flow into deeper layers. Experimental results show that the proposed RDN model achieves state-of-the-art performance in CS-MRI while using far fewer parameters than previously required.

关键词

compressed sensing; magnetic resonance imaging; deep neural networks

全文

PDF


Mesh-Based Autoencoders for Localized Deformation Component Analysis

摘要

Spatially localized deformation components are very useful for shape analysis and synthesis in 3D geometry processing. Several methods have recently been developed, with an aim to extract intuitive and interpretable deformation components. However, these techniques suffer from fundamental limitations especially for meshes with noise or large-scale deformations, and may not always be able to identify important deformation components.In this paper we propose a novel mesh-based autoencoder architecture that is able to cope with meshes with irregular topology. We introduce sparse regularization in this framework, which along with convolutional operations, helps localize deformations.Our framework is capable of extracting localized deformation components from mesh data sets with large-scale deformations and is robust to noise. It also provides a nonlinear approach to reconstruction of meshes using the extracted basis, which is more effective than the current linear combination approach. Extensive experiments show that our method outperforms state-of-the-art methods in both qualitative and quantitative evaluations.

关键词

Mesh; Deformation; Autoencoder; Localized; Sparsity

全文

PDF


Maximum-Variance Total Variation Denoising for Interpretable Spatial Smoothing

摘要

We consider the problem of spatial regression where interpretability of the model is a high priority. Such problems appear frequently in a diverse set of fields from climatology to epidemiology to predictive policing. For cognitive, logistical, and organizational reasons, humans tend to infer regions or neighborhoods of constant value, often with sharp discontinuities between regions, and then assign resources on a per-region basis. Automating this smoothing process presents a unique challenge for spatial smoothing algorithms, which tend to assume stationarity and smoothness everywhere. To address this problem, we propose Maximum Variance Total Variation (MVTV) denoising, a novel method for interpretable nonlinear spatial regression. MVTV divides the feature space into blocks of constant value and smooths the value of all blocks jointly via a convex optimization routine. Our method is fully data-adaptive and incorporates highly robust routines for tuning all hyperparameters automatically. We compare our approach against the existing CART and CRISP methods via both a complexity-accuracy tradeoff metric and a human study, demonstrating that that MVTV is a more powerful and interpretable method.

关键词

interpretability; spatial smoothing

全文

PDF


Differential Performance Debugging With Discriminant Regression Trees

摘要

Differential performance debugging is a technique to find performance problems. It applies in situations where the performance of a program is (unexpectedly) different for different classes of inputs. The task is to explain the differences in asymptotic performance among various input classes in terms of program internals. We propose a data-driven technique based on discriminant regression tree (DRT) learning problem where the goal is to discriminate among different classes of inputs. We propose a new algorithm for DRT learning that first clusters the data into functional clusters, capturing different asymptotic performance classes, and then invokes off-the-shelf decision tree learning algorithms to explain these clusters. We focus on linear functional clusters and adapt classical clustering algorithms (K-means and spectral) to produce them. For the K-means algorithm, we generalize the notion of the cluster centroid from a point to a linear function. We adapt spectral clustering by defining a novel kernel function to capture the notion of linear similarity between two data points. We evaluate our approach on benchmarks consisting of Java programs where we are interested in debugging performance. We show that our algorithm significantly outperforms other well-known regression tree learning algorithms in terms of running time and accuracy of classification.

关键词

Regression Trees; Functional Clustering; Spectrum-based Performance Bug Localization

全文

PDF


Adversarial Zero-shot Learning With Semantic Augmentation

摘要

In situations in which labels are expensive or difficult to obtain, deep neural networks for object recognition often suffer to achieve fair performance. Zero-shot learning is dedicated to this problem. It aims to recognize objects of unseen classes by transferring knowledge from seen classes via a shared intermediate representation. Using the manifold structure of seen training samples is widely regarded as important to learn a robust mapping between samples and the intermediate representation, which is crucial for transferring the knowledge. However, their irregular structures, such as the lack in variation of samples for certain classes and highly overlapping clusters of different classes, may result in an inappropriate mapping. Additionally, in a high dimensional mapping space, the hubness problem may arise, in which one of the unseen classes has a high possibility to be assigned to samples of different classes. To mitigate such problems, we use a generative adversarial network to synthesize samples with specified semantics to cover a higher diversity of given classes and interpolated semantics of pairs of classes. We propose a simple yet effective method for applying the augmented semantics to the hinge loss functions to learn a robust mapping. The proposed method was extensively evaluated on small- and large-scale datasets, showing a significant improvement over state-of-the-art methods.

关键词

zero-shot learning; generative adversarial network; image classification; image retrieval

全文

PDF


Model-Free Iterative Temporal Appliance Discovery for Unsupervised Electricity Disaggregation

摘要

Electricity disaggregation identifies individual appliances from one or more aggregate data streams and has immense potential to reduce residential and commercial electrical waste. Since supervised learning methods rely on meticulously labeled training samples that are expensive to obtain, unsupervised methods show the most promise for wide-spread application. However, unsupervised learning methods previously applied to electricity disaggregation suffer from critical limitations. This paper introduces the concept of iterative appliance discovery, a novel unsupervised disaggregation method that progressively identifies the "easiest to find" or "most likely" appliances first. Once these simpler appliances have been identified, the computational complexity of the search space can be significantly reduced, enabling iterative discovery to identify more complex appliances. We test iterative appliance discovery against an existing competitive unsupervised method using two publicly available datasets. Results using different sampling rates show iterative discovery has faster runtimes and produces better accuracy. Furthermore, iterative discovery does not require prior knowledge of appliance characteristics and demonstrates unprecedented scalability to identify long, overlapped sequences that other unsupervised learning algorithms cannot.

关键词

Unsupervised Learning; Machine Learning Applications; Classification; Time-Series/Data Streams; Electricity Disaggregation

全文

PDF


Multimodal Poisson Gamma Belief Network

摘要

To learn a deep generative model of multimodal data, we propose a multimodal Poisson gamma belief network (mPGBN) that tightly couple the data of different modalities at multiple hidden layers. The mPGBN unsupervisedly extracts a nonnegative latent representation using an upward-downward Gibbs sampler. It imposes sparse connections between different layers, making it simple to visualize the generative process and the relationships between the latent features of different modalities. Our experimental results on bi-modal data consisting of images and tags show that the mPGBN can easily impute a missing modality and hence is useful for both image annotation and retrieval. We further demonstrate that the mPGBN achieves state-of-the-art results on unsupervisedly extracting latent features from multimodal data.

关键词

Unsupervised Learning; Graphical Model Learning; Language and Vision

全文

PDF


When Will You Arrive? Estimating Travel Time Based on Deep Neural Networks

摘要

Estimating the travel time of any path (denoted by a sequence of connected road segments) in a city is of great importance to traffic monitoring, route planning, ridesharing, taxi/Uber dispatching, etc. However, it is a very challenging problem, affected by diverse complex factors, including spatial correlations, temporal dependencies, external conditions (e.g. weather, traffic lights). Prior work usually focuses on estimating the travel times of individual road segments or sub-paths and then summing up these times, which leads to an inaccurate estimation because such approaches do not consider road intersections/traffic lights, and local errors may accumulate. To address these issues, we propose an end-to-end Deep learning framework for Travel Time Estimation called DeepTTE that estimates the travel time of the whole path directly. More specifically, we present a geo-convolution operation by integrating the geographic information into the classical convolution, capable of capturing spatial correlations. By stacking recurrent unit on the geo-convoluton layer, our DeepTTE can capture the temporal dependencies simultaneously. A multi-task learning component is given on the top of DeepTTE, that estimates the travel time of both the entire path and each local path simultaneously during the training phase. The extensive experiments on two large-scale datasets shows our DeepTTE significantly outperforms the state-of-the-art methods.

关键词

Spatio-temporal data; Deep learning; Applications

全文

PDF


GraphGAN: Graph Representation Learning With Generative Adversarial Nets

摘要

The goal of graph representation learning is to embed each vertex in a graph into a low-dimensional vector space. Existing graph representation learning methods can be classified into two categories: generative models that learn the underlying connectivity distribution in the graph, and discriminative models that predict the probability of edge existence between a pair of vertices. In this paper, we propose GraphGAN, an innovative graph representation learning framework unifying above two classes of methods, in which the generative model and discriminative model play a game-theoretical minimax game. Specifically, for a given vertex, the generative model tries to fit its underlying true connectivity distribution over all other vertices and produces "fake" samples to fool the discriminative model, while the discriminative model tries to detect whether the sampled vertex is from ground truth or generated by the generative model. With the competition between these two models, both of them can alternately and iteratively boost their performance. Moreover, when considering the implementation of generative model, we propose a novel graph softmax to overcome the limitations of traditional softmax function, which can be proven satisfying desirable properties of normalization, graph structure awareness, and computational efficiency. Through extensive experiments on real-world datasets, we demonstrate that GraphGAN achieves substantial gains in a variety of applications, including link prediction, node classification, and recommendation, over state-of-the-art baselines.

关键词

graph representation learning; generative adversarial nets; graph softmax

全文

PDF


Collaborative Filtering With Social Exposure: A Modular Approach to Social Recommendation

摘要

This paper is concerned with how to make efficient use of social information to improve recommendations. Most existing social recommender systems assume people share similar preferences with their social friends. Which, however, may not hold true due to various motivations of making online friends and dynamics of online social networks. Inspired by recent causal process based recommendations that first model user exposures towards items and then use these exposures to guide rating prediction, we utilize social information to capture user exposures rather than user preferences. We assume that people get information of products from their online friends and they do not have to share similar preferences, which is less restrictive and seems closer to reality. Under this new assumption, in this paper, we present a novel recommendation approach (named SERec) to integrate social exposure into collaborative filtering. We propose two methods to implement SERec, namely social regularization and social boosting, each with different ways to construct social exposures. Experiments on four real-world datasets demonstrate that our methods outperform the state-of-the-art methods on top-N recommendations. Further study compares the robustness and scalability of the two proposed methods.

关键词

recommendation; social exposure

全文

PDF


AJILE Movement Prediction: Multimodal Deep Learning for Natural Human Neural Recordings and Video

摘要

Developing useful interfaces between brains and machines is a grand challenge of neuroengineering. An effective interface has the capacity to not only interpret neural signals, but predict the intentions of the human to perform an action in the near future; prediction is made even more challenging outside well-controlled laboratory experiments. This paper describes our approach to detect and to predict natural human arm movements in the future, a key challenge in brain computer interfacing that has never before been attempted. We introduce the novel Annotated Joints in Long-term ECoG (AJILE) dataset; AJILE includes automatically annotated poses of 7 upper body joints for four human subjects over 670 total hours (more than 72 million frames), along with the corresponding simultaneously acquired intracranial neural recordings. The size and scope of AJILE greatly exceeds all previous datasets with movements and electrocorticography (ECoG), making it possible to take a deep learning approach to movement prediction. We propose a multimodal model that combines deep convolutional neural networks (CNN) with long short-term memory (LSTM) blocks, leveraging both ECoG and video modalities. We demonstrate that our models are able to detect movements and predict future movements up to 800 msec before movement initiation. Further, our multimodal movement prediction models exhibit resilience to simulated ablation of input neural signals. We believe a multimodal approach to natural neural decoding that takes context into account is critical in advancing bioelectronic technologies and human neuroscience.

关键词

deep learning; neuroscience; multimodal; naturalistic;

全文

PDF


Attention-Based Transactional Context Embedding for Next-Item Recommendation

摘要

To recommend the next item to a user in a transactional context is practical yet challenging in applications such as marketing campaigns. Transactional context refers to the items that are observable in a transaction. Most existing transaction based recommender systems (TBRSs) make recommendations by mainly considering recently occurring items instead of all the ones observed in the current context. Moreover, they often assume a rigid order between items within a transaction, which is not always practical. More importantly, a long transaction often contains many items irreverent to the next choice, which tends to overwhelm the influence of a few truly relevant ones. Therefore, we posit that a good TBRS should not only consider all the observed items in the current transaction but also weight them with different relevance to build an attentive context that outputs the proper next item with a high probability. To this end, we design an effective attention based transaction embedding model (ATEM) for context embedding to weight each observed item in a transaction without assuming order. The empirical study on real-world transaction datasets proves that ATEM significantly outperforms the state-of-the-art methods in terms of both accuracy and novelty.

关键词

全文

PDF


Fully Convolutional Network Based Skeletonization for Handwritten Chinese Characters

摘要

Structural analysis of handwritten characters relies heavily on robust skeletonization of strokes, which has not been solved well by previous thinning methods. This paper presents an effective fully convolutional network (FCN) to extract stroke skeletons for handwritten Chinese characters. We combine the holistically-nested architecture with regressive dense upsampling convolution (rDUC) and recently proposed hybrid dilated convolution (HDC) to generate pixel-level prediction for skeleton extraction. We evaluate our method on character images synthesized from the online handwritten dataset CASIA-OLHWDB and achieve higher accuracy of skeleton pixel detection than traditional thinning algorithms. We also conduct skeleton based character recognition experiments using convolutional neural network (CNN) classifiers on offline/online handwritten datasets, and obtained comparable accuracies with recognition on original character images. This implies the skeletonization loses little shape information.

关键词

skeletonization; fully convolutional network; handwritten character recognition

全文

PDF


Directional Label Rectification in Adaptive Graph

摘要

With the explosive growth of multivariate time-series data, failure (event) analysis has gained widespread applications. A primary goal for failure analysis is to identify the fault signature, i.e., the unique feature pattern to distinguish failure events. However, the complex nature of multivariate time-series data brings challenge in the detection of fault signature. Given a time series from a failure event, the fault signature and the onset of failure are not necessarily adjacent, and the interval between the signature and failure is usually unknown. The uncertainty of such interval causes the uncertainty in labeling timestamps, thus makes it inapplicable to directly employ any standard supervised algorithms in signature detection. To address this problem, we present a novel directional label rectification model which identifies the fault-relevant timestamps and features in a simultaneous approach. Different from previous graph-based label propagation models using fixed graph, we propose to learn an adaptive graph which is optimal for the label rectification process. We conduct extensive experiments on both synthetic and real world datasets and illustrate the advantage of our model in both effectiveness and efficiency.

关键词

Multivariate Time Series; Label Rectification; Adaptive Graph

全文

PDF


Hybrid Attentive Answer Selection in CQA With Deep Users Modelling

摘要

In this paper, we propose solutions to advance answer selection in Community Question Answering (CQA). Unlike previous works, we propose a hybrid attention mechanism to model question-answer pairs. Specifically, for each word, we calculate the intra-sentence attention indicating its local importance and the inter-sentence attention implying its importance to the counterpart sentence. The inter-sentence attention is based on the interactions between question-answer pairs, and the combination of these two attention mechanisms enables us to align the most informative parts in question-answer pairs for sentence matching. Additionally, we exploit user information for answer selection due to the fact that users are more likely to provide correct answers in their areas of expertise. We model users from their written answers to alleviate data sparsity problem, and then learn user representations according to the informative parts in sentences that are useful for question-answer matching task. This mean of modelling users can bridge the semantic gap between different users, as similar users may have the same way of wording their answers. The representations of users, questions and answers are learnt in an end-to-end neural network in a mean that best explains the interrelation between question-answer pairs. We validate the proposed model on a public dataset, and demonstrate its advantages over the baselines with thorough experiments.

关键词

Community Question Answering, Deep Users Modelling, Answer Selection

全文

PDF


Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment

摘要

Developing a computational auditory model to solve the cocktail party problem has long bedeviled scientists, especially for a single microphone recording. Although recent deep learning based frameworks have made significant progress in multi-talker mixed speech separation, most existing deep learning based methods, focusing on separating all the speech channels rather than selectively attending the target speech and ignoring other sounds, may fail to offer a satisfactory solution in a complex auditory scene where the number of input sounds is usually uncertain and even dynamic. In this work, we employ ideas from auditory selective attention of behavioral and cognitive neurosciences and from recent advances of memory-augmented neural networks. Specifically, a unified Auditory Selection framework with Attention and Memory (dubbed ASAM) is proposed. Our ASAM first accumulates the prior knowledge (that is the acoustic feature to one specific speaker) into a life-long memory during the training phase, meanwhile a speech perceptor is trained to extract the temporal acoustic feature and update the memory online when a salient speech is given. Then, the learned memory is utilized to interact with the mixture input to attend and filter the target frequency out from the mixture stream. Finally, the network is trained to minimize the reconstruction error of the attended speech. We evaluate the proposed approach on WSJ0 and THCHS-30 datasets and the experimental results demonstrate that our approach successfully conducts two auditory selection tasks: the top-down task-specific attention (e.g. to follow a conversation with friend) and the bottom-up stimulus-driven attention (e.g. be attracted by a salient speech). Compared with deep clustering based methods, our method conducts competitive advantages especially in a real noise environment (e.g. street junction). Our code is available at https://github.com/jacoxu/ASAM.

关键词

Auditory Selection; Auditory Attention; Speech Separation; Cocktail Party Environment

全文

PDF


Measuring the Popularity of Job Skills in Recruitment Market: A Multi-Criteria Approach

摘要

To cope with the accelerating pace of technological changes, talents are urged to add and refresh their skills for staying in active and gainful employment. This raises a natural question: what are the right skills to learn? Indeed, it is a nontrivial task to measure the popularity of job skills due to the diversified criteria of jobs and the complicated connections within job skills. To that end, in this paper, we propose a data driven approach for modeling the popularity of job skills based on the analysis of large-scale recruitment data. Specifically, we first build a job skill network by exploring a large corpus of job postings. Then, we develop a novel Skill Popularity based Topic Model (SPTM) for modeling the generation of the skill network. In particular, SPTM can integrate different criteria of jobs (e.g., salary levels, company size) as well as the latent connections within skills, thus we can effectively rank the job skills based on their multi-faceted popularity. Extensive experiments on real-world recruitment data validate the effectiveness of SPTM for measuring the popularity of job skills, and also reveal some interesting rules, such as the popular job skills which lead to high-paid employment.

关键词

recruitment

全文

PDF


Learning Generative Neural Networks for 3D Colorization

摘要

Automatic generation of 3D visual content is a fundamental problem that sits at the intersection of visual computing and artificial intelligence. So far, most existing works have focused on geometry synthesis. In contrast, advances in automatic synthesis of color information, which conveys rich semantic information of 3D geometry, remain rather limited. In this paper, we propose to learn a generative model that maps a latent color parameter space to a space of colorizations across a shape collection. The colorizations are diverse on each shape and consistent across the shape collection. We introduce an unsupervised approach for training this generative model and demonstrate its effectiveness across a wide range of categories. The key feature of our approach is that it only requires one colorization per shape in the training data, and utilizes a neural network to propagate the color information of other shapes to train the generative model for each particular shape. This characteristics makes our approach applicable to standard internet shape repositories.

关键词

Generative Modeling; Neural Networks; 3D Convolution; Alternative Minimization

全文

PDF


Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction

摘要

Taxi demand prediction is an important building block to enabling intelligent transportation systems in a smart city. An accurate prediction model can help the city pre-allocate resources to meet travel demand and to reduce empty taxis on streets which waste energy and worsen the traffic congestion. With the increasing popularity of taxi requesting services such as Uber and Didi Chuxing (in China), we are able to collect large-scale taxi demand data continuously. How to utilize such big data to improve the demand prediction is an interesting and critical real-world problem. Traditional demand prediction methods mostly rely on time series forecasting techniques, which fail to model the complex non-linear spatial and temporal relations. Recent advances in deep learning have shown superior performance on traditionally challenging tasks such as image classification by learning the complex features and correlations from large-scale data. This breakthrough has inspired researchers to explore deep learning techniques on traffic prediction problems. However, existing methods on traffic prediction have only considered spatial relation (e.g., using CNN) or temporal relation (e.g., using LSTM) independently. We propose a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework to model both spatial and temporal relations. Specifically, our proposed model consists of three views: temporal view (modeling correlations between future demand values with near time points via LSTM), spatial view (modeling local spatial correlation via local CNN), and semantic view (modeling correlations among regions sharing similar temporal patterns). Experiments on large-scale real taxi demand data demonstrate effectiveness of our approach over state-of-the-art methods.

关键词

demand prediction;deep neural network;multi-view

全文

PDF


WalkRanker: A Unified Pairwise Ranking Model With Multiple Relations for Item Recommendation

摘要

Top-N item recommendation techniques, e.g., pairwise models, learn the rank of users' preferred items through separating items into positive samples if user-item interactions exist, and negative samples otherwise. This separation results in an important issue: the extreme imbalance between positive and negative samples, because the number of items with user actions is much less than those without actions. The problem is even worse for "cold-start" users. In addition, existing learning models only consider the observed user-item proximity, while neglecting other useful relations, such as the unobserved but potentially helpful user-item relations, and high-order proximity in user-user, item-item relations. In this paper, we aim at incorporating multiple types of user-item relations into a unified pairwise ranking model towards approximately optimizing ranking metrics mean average precision (MAP), and mean reciprocal rank (MRR). Instead of taking statical separation of positive and negative sets, we employ a random walk approach to dynamically draw positive samples from short random walk sequences, and a rank-aware negative sampling method to draw negative samples for efficiently learning the proposed pairwise ranking model. The proposed method is compared with several state-of-the-art baselines on two large and sparse datasets. Experimental results show that our proposed model outperforms the other baselines with average 4% at different top-N metrics, in particular for cold-start users with 6% on average.

关键词

one-class collaborative filtering; pairwise ranking; top-N recommendation

全文

PDF


Sequence-to-Point Learning With Neural Networks for Non-Intrusive Load Monitoring

摘要

Energy disaggregation (a.k.a nonintrusive load monitoring, NILM), a single-channel blind source separation problem, aims to decompose the mains which records the whole house electricity consumption into appliance-wise readings. This problem is difficult because it is inherently unidentifiable. Recent approaches have shown that the identifiability problem could be reduced by introducing domain knowledge into the model. Deep neural networks have been shown to be a promising approach for these problems, but sliding windows are necessary to handle the long sequences which arise in signal processing problems, which raises issues about how to combine predictions from different sliding windows. In this paper, we propose sequence-to-point learning, where the input is a window of the mains and the output is a single point of the target appliance. We use convolutional neural networks to train the model. Interestingly, we systematically show that the convolutional neural networks can inherently learn the signatures of the target appliances, which are automatically added into the model to reduce the identifiability problem. We applied the proposed neural network approaches to real-world household energy data, and show that the methods achieve state-of-the-art performance, improving two standard error measures by 84% and 92%.

关键词

MLA: Applications of Supervised Learning; ML: Deep Learning/Neural Networks; NILM; Disagregation; Single Channel Blind Source Separation

全文

PDF


Feature Enhancement Network: A Refined Scene Text Detector

摘要

In this paper, we propose a refined scene text detector with a novel Feature Enhancement Network (FEN)for Region Proposal and Text Detection Refinement. Retrospectively, both region proposal with only 3 x 3 sliding-window feature and text detection refinement with single scale high level feature are insufficient, especially for smaller scene text. Therefore, we design a new FEN network with task-specific, low and high level semantic features fusion to improve the performance of text detection. Besides, since unitary position-sensitive RoI pooling in general object detection is unreasonable for variable text regions, an adaptively weighted position-sensitive RoI pooling layer is devised for further enhancing the detecting accuracy. To tackle the sample-imbalance problem during the refinement stage,we also propose an effective positives mining strategy for efficiently training our network. Experiments on ICDAR2011 and 2013 robust text detection benchmarks demonstrate that our method can achieve state-of-the-art results, outperforming all reported methods in terms of F-measure.

关键词

Feature Enhancement Network, Text Detection, Region Proposal, adaptively weighted position-sensitive RoI pooling, positives mining

全文

PDF


COSINE: Community-Preserving Social Network Embedding From Information Diffusion Cascades

摘要

This paper studies the problem of social network embedding without relying on network structures that are usually not observed in many cases. We address that the information diffusion process across networks naturally reflects rich proximity relationships between users. Meanwhile, social networks contain multiple communities regularizing communication pathways for information propagation. Based on the above observations, we propose a probabilistic generative model, called COSINE, to learn community-preserving social network embeddings from the recurrent and time-stamped social contagion logs, namely information diffusion cascades. The learned embeddings therefore capture the high-order user proximities in social networks. Leveraging COSINE, we are able to discover underlying social communities and predict temporal dynamics of social contagion. Experimental results on both synthetic and real-world datasets show that our proposed model significantly outperforms the existing approaches.

关键词

全文

PDF


Data Poisoning Attacks on Multi-Task Relationship Learning

摘要

Multi-task learning (MTL) is a machine learning paradigm that improves the performance of each task by exploiting useful information contained in multiple related tasks. However, the relatedness of tasks can be exploited by attackers to launch data poisoning attacks, which has been demonstrated a big threat to single-task learning. In this paper, we provide the first study on the vulnerability of MTL. Specifically, we focus on multi-task relationship learning (MTRL) models, a popular subclass of MTL models where task relationships are quantized and are learned directly from training data. We formulate the problem of computing optimal poisoning attacks on MTRL as a bilevel program that is adaptive to arbitrary choice of target tasks and attacking tasks. We propose an efficient algorithm called PATOM for computing optimal attack strategies. PATOM leverages the optimality conditions of the subproblem of MTRL to compute the implicit gradients of the upper level objective function. Experimental results on real-world datasets show that MTRL models are very sensitive to poisoning attacks and the attacker can significantly degrade the performance of target tasks, by either directly poisoning the target tasks or indirectly poisoning the related tasks exploiting the task relatedness. We also found that the tasks being attacked are always strongly correlated, which provides a clue for defending against such attacks.

关键词

data poisoning; multi-task learning

全文

PDF


An Adversarial Hierarchical Hidden Markov Model for Human Pose Modeling and Generation

摘要

We propose a hierarchical extension to hidden Markov model (HMM) under the Bayesian framework to overcome its limited model capacity. The model parameters are treated as random variables whose distributions are governed by hyperparameters. Therefore the variation in data can be modeled at both instance level and distribution level. We derive a novel learning method for estimating the parameters and hyperparameters of our model based on adversarial learning framework, which has shown promising results in generating photorealistic images and videos. We demonstrate the benefit of the proposed method on human motion capture data through comparison with both state-of-the-art methods and the same model that is learned by maximizing likelihood. The first experiment on reconstruction shows the model's capability of generalizing to novel testing data. The second experiment on synthesis shows the model's capability of generating realistic and diverse data.

关键词

Generative Model; Probabilistic Dynamic Model; Adversarial Learning; Data Synthesis

全文

PDF


Unsupervised Representation Learning With Long-Term Dynamics for Skeleton Based Action Recognition

摘要

In recent years, skeleton based action recognition is becoming an increasingly attractive alternative to existing video-based approaches, beneficial from its robust and comprehensive 3D information. In this paper, we explore an unsupervised representation learning approach for the first time to capture the long-term global motion dynamics in skeleton sequences. We design a conditional skeleton inpainting architecture for learning a fixed-dimensional representation, guided by additional adversarial training strategies. We quantitatively evaluate the effectiveness of our learning approach on three well-established action recognition datasets. Experimental results show that our learned representation is discriminative for classifying actions and can substantially reduce the sequence inpainting errors.

关键词

Unsupervised Learning; Action Recognition; RNN; GAN

全文

PDF


SFCN-OPI: Detection and Fine-Grained Classification of Nuclei Using Sibling FCN With Objectness Prior Interaction

摘要

Cell nuclei detection and fine-grained classification have been fundamental yet challenging problems in histopathology image analysis. Due to the nuclei tiny size, significant inter-/intra-class variances, as well as the inferior image quality, previous automated methods would easily suffer from limited accuracy and robustness. In the meanwhile, existing approaches usually deal with these two tasks independently, which would neglect the close relatedness of them. In this paper, we present a novel method of sibling fully convolutional network with prior objectness interaction (called SFCN-OPI) to tackle the two tasks simultaneously and interactively using a unified end-to-end framework. Specifically, the sibling FCN branches share features in earlier layers while holding respective higher layers for specific tasks. More importantly, the detection branch outputs the objectness prior which dynamically interacts with the fine-grained classification sibling branch during the training and testing processes. With this mechanism, the fine-grained classification successfully focuses on regions with high confidence of nuclei existence and outputs the conditional probability, which in turn benefits the detection through back propagation. Extensive experiments on colon cancer histology images have validated the effectiveness of our proposed SFCN-OPI and our method has outperformed the state-of-the-art methods by a large margin.

关键词

Histopathology Image Analysis; Objectness Prior Interaction; Multi-task Learning

全文

PDF


Parameter-Free Centralized Multi-Task Learning for Characterizing Developmental Sex Differences in Resting State Functional Connectivity

摘要

In contrast to most existing studies that typically characterize the developmental sex differences using analysis of variance or equivalently multiple linear regression, we present a parameter-free centralized multi-task learning method to identify sex specific and common resting state functional connectivity (RSFC) patterns underlying the brain development based on resting state functional MRI (rs-fMRI) data. Specifically, we design a novel multi-task learning model to characterize sex specific and common RSFC patterns in an age prediction framework by regarding the age prediction for males and females as separate tasks. Moreover, the importance of each task and the balance of these two patterns, respectively, are automatically learned in order to make the multi-task learning robust as well as free of tunable parameters, i.e., parameter-free for short. Our experimental results on synthetic datasets verified the effectiveness of our method with respect to prediction performance, and experimental results on rs-fMRI scans of 1041 subjects (651 males) of the Philadelphia Neurodevelopmental Cohort (PNC) showed that our method could improve the age prediction on average by 5.82% with statistical significance than the best alternative methods under comparison, in addition to characterizing the developmental sex differences in RSFC patterns.

关键词

全文

PDF


Safe Reinforcement Learning via Shielding

摘要

Reinforcement learning algorithms discover policies that maximize reward, but do not necessarily guarantee safety during learning or execution phases. We introduce a new approach to learn optimal policies while enforcing properties expressed in temporal logic. To this end, given the temporal logic specification that is to be obeyed by the learning system, we propose to synthesize a reactive system called a shield. The shield monitors the actions from the learner and corrects them only if the chosen action causes a violation of the specification. We discuss which requirements a shield must meet to preserve the convergence guarantees of the learner. Finally, we demonstrate the versatility of our approach on several challenging reinforcement learning scenarios.

关键词

Reinforcement Learning; Formal Methods

全文

PDF


Sample-Efficient Learning of Mixtures

摘要

We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let F be an arbitrary class of probability distributions, and let Fk denote the class of k-mixtures of elements of F. Assuming the existence of a method for learning F with sample complexity m(ε), we provide a method for learning Fk with sample complexity O((k.log k .m(ε))/(ε2)). Our mixture learning algorithm has the property that, if the F-learner is proper and agnostic, then the Fk-learner would be proper and agnostic as well. This general result enables us to improve the best known sample complexity upper bounds for a variety of important mixture classes. First, we show that the class of mixtures of k axis-aligned Gaussians in Rd is PAC-learnable in the agnostic setting with O((kd)/(ε4)) samples, which is tight in k and d up to logarithmic factors. Second, we show that the class of mixtures of k Gaussians in Rd is PAC-learnable in the agnostic setting with sample complexity Õ((kd2)/(ε4)), which improves the previous known bounds of Õ((k3.d2)/(ε4)) and Õ(k4.d4/ε2) in its dependence on k and d. Finally, we show that the class of mixtures of k log-concave distributions over Rd is PAC-learnable using Õ(k.d((d+5)/2)ε(-(d+9)/2)) samples.

关键词

mixture models; distribution learning; Gaussian;sample complexity; sample efficient; mixture of Gaussians

全文

PDF


Learning to Attack: Adversarial Transformation Networks

摘要

With the rapidly increasing popularity of deep neural networks for image recognition tasks, a parallel interest in generating adversarial examples to attack the trained models has arisen. To date, these approaches have involved either directly computing gradients with respect to the image pixels or directly solving an optimization on the image pixels. We generalize this pursuit in a novel direction: can a separate network be trained to efficiently attack another fully trained network? We demonstrate that it is possible, and that the generated attacks yield startling insights into the weaknesses of the target network. We call such a network an Adversarial Transformation Network (ATN). ATNs transform any input into an adversarial attack on the target network, while being minimally perturbing to the original inputs and the target network's outputs. Further, we show that ATNs are capable of not only causing the target network to make an error, but can be constructed to explicitly control the type of misclassification made. We demonstrate ATNs on both simple MNIST-digit classifiers and state-of-the-art ImageNet classifiers deployed by Google, Inc.: Inception ResNet-v2.

关键词

Deep Neural Networks, Adversarial Training, ImageNet, Machine Learning

全文

PDF


Online Learning for Structured Loss Spaces

摘要

We consider prediction with expert advice when the loss vectors are assumed to lie in a set described by the sum of atomic norm balls. We derive a regret bound for a general version of the online mirror descent (OMD) algorithm that uses a combination of regularizers, each adapted to the constituent atomic norms. The general result recovers standard OMD regret bounds, and yields regret bounds for new structured settings where the loss vectors are (i) noisy versions of vectors from a low-dimensional subspace, (ii) sparse vectors corrupted with noise, and (iii) sparse perturbations of low-rank vectors. For the problem of online learning with structured losses, we also show lower bounds on regret in terms of rank and sparsity of the loss vectors, which implies lower bounds for the above additive loss settings as well.

关键词

Online Learning; Structured Losses; Online Mirror Descent; OMD; Atomic Norm; Strong Convexity

全文

PDF


ARC: Adversarial Robust Cuts for Semi-Supervised and Multi-Label Classification

摘要

Many structured prediction tasks arising in computer vision and natural language processing tractably reduce to making minimum cost cuts in graphs with edge weights learned using maximum margin methods. Unfortunately, the hinge loss used to construct these methods often provides a particularly loose bound on the loss function of interest (e.g., the Hamming loss). We develop Adversarial Robust Cuts (ARC), an approach that poses the learning task as a minimax game between predictor and "label approximator" based on minimum cost graph cuts. Unlike maximum margin methods, this game-theoretic perspective always provides meaningful bounds on the Hamming loss. We conduct multi-label and semi-supervised binary prediction experiments that demonstrate the benefits of our approach.

关键词

Adversarial structured prediction; Graph cut; multi-label prediction;semi-supervised classification

全文

PDF


Estimating the Class Prior in Positive and Unlabeled Data Through Decision Tree Induction

摘要

For tasks such as medical diagnosis and knowledge base completion, a classifier may only have access to positive and unlabeled examples, where the unlabeled data consists of both positive and negative examples. One way that enables learning from this type of data is knowing the true class prior. In this paper, we propose a simple yet effective method for estimating the class prior, by estimating the probability that a positive example is selected to be labeled. Our key insight is that subdomains of the data give a lower bound on this probability. This lower bound gets closer to the real probability as the ratio of labeled examples increases. Finding such subsets can naturally be done via top-down decision tree induction. Experiments show that our method makes estimates which are equivalently accurate as those of the state of the art methods, and is an order of magnitude faster.

关键词

Positive Unlabeled Class Prior

全文

PDF


Long-Term Image Boundary Prediction

摘要

Boundary estimation in images and videos has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a corner stone of visual perception. While prior work has focused on estimating boundaries for observed frames, our work aims at predicting boundaries of future unobserved frames. This requires our model to learn about the fate of boundaries and corresponding motion patterns---including a notion of "intuitive physics." We experiment on natural video sequences along with synthetic sequences with deterministic physics-based and agent-based motions. While not being our primary goal, we also show that fusion of RGB and boundary prediction leads to improved RGB predictions.

关键词

Vision;Image Boundaries;Prediction

全文

PDF


Algorithms for Generalized Topic Modeling

摘要

Recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. In this work we consider a broad generalization of the traditional topic modeling framework, where we no longer assume that words are drawn i.i.d. and instead view a topic as a complex distribution over sequences of paragraphs. Since one could not hope to even represent such a distribution in general (even if paragraphs are given using some natural feature representation), we aim instead to directly learn a predictor that given a new document, accurately predicts its topic mixture, without learning the distributions explicitly. We present several natural conditions under which one can do this from unlabeled data only, and give efficient algorithms to do so, also discussing issues such as noise tolerance and sample complexity. More generally, our model can be viewed as a generalization of the multi-view or co-training setting in machine learning.

关键词

Topic Modeling; Cotraining; Multi-view learning

全文

PDF


Bayesian Robust Attributed Graph Clustering: Joint Learning of Partial Anomalies and Group Structure

摘要

We study the problem of robust attributed graph clustering. In real data, the clustering structure is often obfuscated due to anomalies or corruptions. While robust methods have been recently introduced that handle anomalies as part of the clustering process, they all fail to account for one core aspect: Since attributed graphs consist of two views (network structure and attributes) anomalies might materialize only partially, i.e. instances might be corrupted in one view but perfectly fit in the other. In this case, we can still derive meaningful cluster assignments. Existing works only consider complete anomalies. In this paper, we present a novel probabilistic generative model (PAICAN) that explicitly models partial anomalies by generalizing ideas of Degree Corrected Stochastic Block Models and Bernoulli Mixture Models. We provide a highly scalable variational inference approach with runtime complexity linear in the number of edges. The robustness of our model w.r.t. anomalies is demonstrated by our experimental study, outperforming state-of-the-art competitors.

关键词

robust clustering; attributed graphs; partial anomalies; stochastic block models; variational inference

全文

PDF


Trace Ratio Optimization With Feature Correlation Mining for Multiclass Discriminant Analysis

摘要

Fisher's linear discriminant analysis is a widely accepted dimensionality reduction method, which aims to find a transformation matrix to convert feature space to a smaller space by maximising the between-class scatter matrix while minimising the within-class scatter matrix. Although the fast and easy process of finding the transformation matrix has made this method attractive, overemphasizing the large class distances makes the criterion of this method suboptimal. In this case, the close class pairs tend to overlap in the subspace. Despite different weighting methods having been developed to overcome this problem, there is still a room to improve this issue. In this work, we study a weighted trace ratio by maximising the harmonic mean of the multiple objective reciprocals. To further improve the performance, we enforce the l2,1-norm to the developed objective function. Additionally, we propose an iterative algorithm to optimise this objective function. The proposed method avoids the domination problem of the largest objective, and guarantees that no objectives will be too small. This method can be more beneficial if the number of classes is large. The extensive experiments on different datasets show the effectiveness of our proposed method when compared with four state-of-the-art methods.

关键词

Trace Ratio; L21 regularisation; Feature Extraction

全文

PDF


Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

摘要

In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical methods exist for determining high-confidence policy performance bounds in the inverse reinforcement learning setting---where the true reward function is unknown and only samples of expert behavior are given. We propose a sampling method based on Bayesian inverse reinforcement learning that uses demonstrations to determine practical high-confidence upper bounds on the alpha-worst-case difference in expected return between any evaluation policy and the optimal policy under the expert's unknown reward function. We evaluate our proposed bound on both a standard grid navigation task and a simulated driving task and achieve tighter and more accurate bounds than a feature count-based baseline. We also give examples of how our proposed bound can be utilized to perform risk-aware policy selection and risk-aware policy improvement. Because our proposed bound requires several orders of magnitude fewer demonstrations than existing high-confidence bounds, it is the first practical method that allows agents that learn from demonstration to express confidence in the quality of their learned policy.

关键词

inverse reinforcement learning; learning from demonstration; high-confidence performance bounds; risk-sensitive

全文

PDF


Teaching a Machine to Read Maps With Deep Reinforcement Learning

摘要

The ability to use a 2D map to navigate a complex 3D environment is quite remarkable, and even difficult for many humans. Localization and navigation is also an important problem in domains such as robotics, and has recently become a focus of the deep reinforcement learning community. In this paper we teach a reinforcement learning agent to read a map in order to find the shortest way out of a random maze it has never seen before. Our system combines several state-of-the-art methods such as A3C and incorporates novel elements such as a recurrent localization cell. Our agent learns to localize itself based on 3D first person images and an approximate orientation angle. The agent generalizes well to bigger mazes, showing that it learned useful localization and navigation capabilities.

关键词

Deep Reinforcement Learning, Navigation, Localization, DeepMind Lab

全文

PDF


Graph Scan Statistics With Uncertainty

摘要

Scan statistics is one of the most popular approaches for anomaly detection in spatial and network data. In practice, there are numerous sources of uncertainty in the observed data. However, most prior works have overlooked such uncertainty, which can affect the accuracy and inferences of such methods. In this paper, we develop the first systematic approach to incorporating uncertainty in scan statistics. We study two formulations for robust scan statistics, one based on the sample average approximation and the other using a max-min objective. We show that uncertainty significantly increases the computational complexity of these problems. Rigorous algorithms and efficient heuristics for both formulations are developed with justification of theoretical bounds. We evaluate our proposed methods on synthetic and real datasets, and we observe that our methods give significant improvement in the detection power as well as optimization objective, relative to a baseline.

关键词

scan statistics; graph anomaly detection; uncertainty

全文

PDF


Mining Heavy Temporal Subgraphs: Fast Algorithms and Applications

摘要

Anomaly detection is a fundamental problem in dynamic networks. In this paper, we study an approach for identifying anomalous subgraphs based on the Heaviest Dynamic Subgraph (HDS) problem. The HDS in a time-evolving edge-weighted graph consists of a pair containing a subgraph and subinterval whose sum of edge weights is maximized. The HDS problem in a static graph is equivalent to the Prize Collecting Steiner Tree (PCST) problem with the Net-Worth objective---this is a very challenging problem, in general, and numerous heuristics have been proposed. Prior methods for the HDS problem use the PCST solution as a heuristic, and run in time quadratic in the size of the graph. As a result, they do not scale well to large instances. In this paper, we develop a new approach for the HDS problem, which combines rigorous algorithmic and practical techniques and has much better scalability. Our algorithm is able to extend to other variations of the HDS problem, such as the problem of finding multiple anomalous regions. We evaluate our algorithms in a diverse set of real and synthetic networks, and we find solutions with higher score and better detection power for anomalous events compared to earlier heuristics.

关键词

graph anomaly detection;temporal graph;steiner tree

全文

PDF


Efficient Architecture Search by Network Transformation

摘要

Techniques for automatically designing deep neural network architectures such as reinforcement learning based approaches have recently shown promising results. However, their success is based on vast computational resources (e.g. hundreds of GPUs), making them difficult to be widely used. A noticeable limitation is that they still design and train each network from scratch during the exploration of the architecture space, which is highly inefficient. In this paper, we propose a new framework toward efficient architecture search by exploring the architecture space based on the current network and reusing its weights. We employ a reinforcement learning agent as the meta-controller, whose action is to grow the network depth or layer width with function-preserving transformations. As such, the previously validated networks can be reused for further exploration, thus saves a large amount of computational cost. We apply our method to explore the architecture space of the plain convolutional neural networks (no skip-connections, branching etc.) on image benchmark datasets (CIFAR-10, SVHN) with restricted computational resources (5 GPUs). Our method can design highly competitive networks that outperform existing networks using the same design scheme. On CIFAR-10, our model without skip-connections achieves 4.23% test error rate, exceeding a vast majority of modern architectures and approaching DenseNet. Furthermore, by applying our method to explore the DenseNet architecture space, we are able to achieve more accurate networks with fewer parameters.

关键词

Automatic Architecture Search; Deep Neural Networks; Reinforcement Learning

全文

PDF


Unsupervised Domain Adaptation With Distribution Matching Machines

摘要

Domain adaptation generalizes a learning model across source domain and target domain that follow different distributions. Most existing work follows a two-step procedure: first, explores either feature matching or instance reweighting independently, and second, train the transfer classifier separately. In this paper, we show that either feature matching or instance reweighting can only reduce, but not remove, the cross-domain discrepancy, and the knowledge hidden in the relations between the data labels from the source and target domains is important for unsupervised domain adaptation. We propose a new Distribution Matching Machine (DMM) based on the structural risk minimization principle, which learns a transfer support vector machine by extracting invariant feature representations and estimating unbiased instance weights that jointly minimize the cross-domain distribution discrepancy. This leads to a robust transfer learner that performs well against both mismatched features and irrelevant instances. Our theoretical analysis proves that the proposed approach further reduces the generalization error bound of related domain adaptation methods. Comprehensive experiments validate that the DMM approach significantly outperforms competitive methods on standard domain adaptation benchmarks.

关键词

Transfer learning; Domain adaptation

全文

PDF


Link Prediction via Subgraph Embedding-Based Convex Matrix Completion

摘要

Link prediction is of fundamental importance in network science and machine learning. Early methods consider only simple topological features, while subsequent supervised approaches typically rely on human-labeled data and feature engineering. In this work, we present a new representation learning-based approach called SEMAC that jointly exploits fine-grained node features as well as the overall graph topology. In contrast to the SGNS or SVD methods espoused in previous representation-based studies, our model represents nodes in terms of subgraph embeddings acquired via a form of convex matrix completion to iteratively reduce the rank, and thereby, more effectively eliminate noise in the representation. Thus, subgraph embeddings and convex matrix completion are elegantly integrated into a novel link prediction framework. Experimental results on several datasets show the effectiveness of our method compared to previous work.

关键词

全文

PDF


Reversible Architectures for Arbitrarily Deep Residual Neural Networks

摘要

Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing the state-of-the-art performance with deeper and wider architectures. In this work, we interpret deep residual networks as ordinary differential equations (ODEs), which have long been studied in mathematics and physics with rich theoretical and empirical success. From this interpretation, we develop a theoretical framework on stability and reversibility of deep neural networks, and derive three reversible neural network architectures that can go arbitrarily deep in theory. The reversibility property allows a memory-efficient implementation, which does not need to store the activations for most hidden layers. Together with the stability of our architectures, this enables training deeper networks using only modest computational resources. We provide both theoretical analyses and empirical results. Experimental results demonstrate the efficacy of our architectures against several strong baselines on CIFAR-10, CIFAR-100 and STL-10 with superior or on-par state-of-the-art performance. Furthermore, we show our architectures yield superior results when trained using fewer training data.

关键词

全文

PDF


Gated-Attention Architectures for Task-Oriented Language Grounding

摘要

To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map it to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states.

关键词

Machine Learning; Deep Reinforcement Learning; Gated-Attention; Language Grounding

全文

PDF


AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training

摘要

Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms (offering 100 of TeraOps/s of computational capacity) is expected to be severely communication constrained. To overcome this limitation, new gradient compression techniques are needed that are computationally friendly, applicable to a wide variety of layers seen in Deep Neural Networks and adaptable to variations in network architectures as well as their hyper-parameters. In this paper we introduce a novel technique - the Adaptive Residual Gradient Compression (AdaComp) scheme. AdaComp is based on localized selection of gradient residues and automatically tunes the compression rate depending on local activity. We show excellent results on a wide spectrum of state of the art Deep Learning models in multiple domains (vision, speech, language), datasets (MNIST, CIFAR10, ImageNet, BN50, Shakespeare), optimizers (SGD with momentum, Adam) and network parameters (number of learners, minibatch-size etc.). Exploiting both sparsity and quantization, we demonstrate end-to-end compression rates of ∼200× for fully-connected and recurrent layers, and ∼40× for convolutional layers, without any noticeable degradation in model accuracies.

关键词

deep learning; distributed training; data parallelism; compression; algorithm; gradients

全文

PDF


LSTD: A Low-Shot Transfer Detector for Object Detection

摘要

Recent advances in object detection are mainly driven by deep learning with large-scale detection benchmarks. However, the fully-annotated training set is often limited for a target detection task, which may deteriorate the performance of deep detectors. To address this challenge, we propose a novel low-shot transfer detector (LSTD) in this paper, where we leverage rich source-domain knowledge to construct an effective target-domain detector with very few training examples. The main contributions are described as follows. First, we design a flexible deep architecture of LSTD to alleviate transfer difficulties in low-shot detection. This architecture can integrate the advantages of both SSD and Faster RCNN in a unified deep framework. Second, we introduce a novel regularized transfer learning framework for low-shot detection, where the transfer knowledge (TK) and background depression (BD) regularizations are proposed to leverage object knowledge respectively from source and target domains, in order to further enhance fine-tuning with a few target images. Finally, we examine our LSTD on a number of challenging low-shot detection experiments, where LSTD outperforms other state-of-the-art approaches. The results demonstrate that LSTD is a preferable deep detector for low-shot scenarios.

关键词

Object Detection; Low-Shot Learning; Transfer Learning

全文

PDF


Automatic Segmentation of Data Sequences

摘要

Segmenting temporal data sequences is an important problem which helps in understanding data dynamics in multiple applications such as epidemic surveillance, motion capture sequences, etc. In this paper, we give DASSA, the first self-guided and efficient algorithm to automatically find a segmentation that best detects the change of pattern in data sequences. To avoid introducing tuning parameters, we design DASSA to be a multi-level method which examines segments at each level of granularity via a compact data structure called the segment-graph. We build this data structure by carefully leveraging the information bottleneck method with the MDL principle to effectively represent each segment.Next, DASSA efficiently finds the optimal segmentation via a novel average-longest-path optimization on the segment-graph. Finally we show how the outputs from DASSA can be naturally interpreted to reveal meaningful patterns. We ran DASSA on multiple real datasets of varying sizes and it is very effective in finding the time-cut points of the segmentations (in some cases recovering the cut points perfectly) as well as in finding the corresponding changing patterns.

关键词

全文

PDF


DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer

摘要

We have witnessed rapid evolution of deep neural network architecture design in the past years. These latest progresses greatly facilitate the developments in various areas such as computer vision and natural language processing. However, along with the extraordinary performance, these state-of-the-art models also bring in expensive computational cost. Directly deploying these models into applications with real-time requirement is still infeasible. Recently, Hinton et al. have shown that the dark knowledge within a powerful teacher model can significantly help the training of a smaller and faster student network. These knowledge are vastly beneficial to improve the generalization ability of the student model. Inspired by their work, we introduce a new type of knowledge---cross sample similarities for model compression and acceleration. This knowledge can be naturally derived from deep metric learning model. To transfer them, we bring the "learning to rank" technique into deep metric learning formulation. We test our proposed DarkRank method on various metric learning tasks including pedestrian re-identification, image retrieval and image clustering. The results are quite encouraging. Our method can improve over the baseline method by a large margin. Moreover, it is fully compatible with other existing methods. When combined, the performance can be further boosted.

关键词

Knowledge distillation; Model acceleration; Metric learning

全文

PDF


Automatic Parameter Tying: A New Approach for Regularized Parameter Learning in Markov Networks

摘要

Parameter tying is a regularization method in which parameters (weights) of a machine learning model are partitioned into groups by leveraging prior knowledge and all parameters in each group are constrained to take the same value. In this paper, we consider the problem of parameter learning in Markov networks and propose a novel approach called automatic parameter tying (APT) that uses automatic instead of a priori and soft instead of hard parameter tying as a regularization method to alleviate overfitting. The key idea behind APT is to set up the learning problem as the task of finding parameters and groupings of parameters such that the likelihood plus a regularization term is maximized. The regularization term penalizes models where parameter values deviate from their group mean parameter value. We propose and use a block coordinate ascent algorithm to solve the optimization task. We analyze the sample complexity of our new learning algorithm and show that it yields optimal parameters with high probability when the groups are well separated. Experimentally, we show that our method improves upon L2 regularization and suggest several pragmatic techniques for good practical performance.

关键词

Markov networks; parameter learning; regularization

全文

PDF


Expected Policy Gradients

摘要

We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected sarsa, EPG integrates across the action when estimating the gradient, instead of relying only on the action in the sampled trajectory. We establish a new general policy gradient theorem, of which the stochastic and deterministic policy gradient theorems are special cases. We also prove that EPG reduces the variance of the gradient estimates without requiring deterministic policies and, for the Gaussian case, with no computational overhead. Finally, we show that it is optimal in a certain sense to explore with a Gaussian policy such that the covariance is proportional to the exponential of the scaled Hessian of the critic with respect to the actions. We present empirical results confirming that this new form of exploration substantially outperforms DPG with the Ornstein-Uhlenbeck heuristic in four challenging MuJoCo domains.

关键词

Reinforcement Learning; MDPs; Actor-Critic; Policy Gradients

全文

PDF


Diverse Exploration for Fast and Safe Policy Improvement

摘要

We study an important yet under-addressed problem of quickly and safely improving policies in online reinforcement learning domains. As its solution, we propose a novel exploration strategy - diverse exploration (DE), which learns and deploys a diverse set of safe policies to explore the environment. We provide DE theory explaining why diversity in behavior policies enables effective exploration without sacrificing exploitation. Our empirical study shows that an online policy improvement algorithm framework implementing the DE strategy can achieve both fast policy improvement and safe online performance.

关键词

Reinforcement Learning

全文

PDF


Clustering Small Samples With Quality Guarantees: Adaptivity With One2all PPS

摘要

Clustering of data points is a fundamental tool in data analysis. We consider points X in a relaxed metric space, where the triangle inequality holds within a constant factor. A clustering of X is a partition of X defined by a set of points Q(centroids), according to the closest centroid. The cost of clustering X by Q is V(Q)= ∑x ∈ X dxQ. This formulation generalizes classic k-means clustering, which uses squared distances. Two basic tasks, parametrized by k ≥ 1, are cost estimation, which returns (approximate) V(Q) for queries Q such that |Q| = k and clustering, which returns an (approximate) minimizer of V(Q) of size |Q|= k. When the data set X is very large, we seek efficient constructions of small samples that can act as surrogates for performing these tasks. Existing constructions that provide quality guarantees, however, are either worst-case, and unable to benefit from structure of real data sets, or make explicit strong assumptions on the structure. We show here how to avoid both these pitfalls using adaptive designs. The core of our design are the novel one2all probabilities, computed for a set M of centroids and α ≥ 1:  The clustering cost of  each Q with cost V(Q) ≥ V(M)/α can be estimated well from a sample of size O(α |M| ε-2). For cost estimation, we apply one2all with a bicriteria approximate M, while adaptively balancing |M| and α to optimize sample size per quality. For clustering, we present a wrapper that adaptively applies a base clustering algorithm to a sample S, using the smallest sample that provides the desired statistical guarantees on quality. We demonstrate experimentally the huge gains of using our adaptive instead of worst-case methods.

关键词

clustering; multi-objective sampling; adaptive algorithms

全文

PDF


Distributional Reinforcement Learning With Quantile Regression

摘要

In reinforcement learning (RL), an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value function. In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean. That is, we examine methods of learning the value distribution instead of the value function. We give results that close a number of gaps between the theoretical and algorithmic results given by Bellemare, Dabney, and Munos (2017). First, we extend existing results to the approximate distribution setting. Second, we present a novel distributional reinforcement learning algorithm consistent with our theoretical formulation. Finally, we evaluate this new algorithm on the Atari 2600 games, observing that it significantly outperforms many of the recent improvements on DQN, including the related distributional algorithm C51.

关键词

reinforcement learning; distributional reinforcement learning; quantile regression; temporal difference learning

全文

PDF


Multi-Step Reinforcement Learning: A Unifying Algorithm

摘要

Unifying seemingly disparate algorithmic ideas to produce better performing algorithms has been a longstanding goal in reinforcement learning. As a primary example, TD(λ) elegantly unifies one-step TD prediction with Monte Carlo methods through the use of eligibility traces and the trace-decay parameter. Currently, there are a multitude of algorithms that can be used to perform TD control, including Sarsa, Q-learning, and Expected Sarsa. These methods are often studied in the one-step case, but they can be extended across multiple time steps to achieve better performance. Each of these algorithms is seemingly distinct, and no one dominates the others for all problems. In this paper, we study a new multi-step action-value algorithm called Q(σ) that unifies and generalizes these existing algorithms, while subsuming them as special cases. A new parameter, σ, is introduced to allow the degree of sampling performed by the algorithm at each step during its backup to be continuously varied, with Sarsa existing at one extreme (full sampling), and Expected Sarsa existing at the other (pure expectation). Q(σ) is generally applicable to both on- and off-policy learning, but in this work we focus on experiments in the on-policy case. Our results show that an intermediate value of σ, which results in a mixture of the existing algorithms, performs better than either extreme. The mixture can also be varied dynamically which can result in even greater performance.

关键词

Reinforcement Learning; On-Policy; Markov Decision Processes

全文

PDF


Randomized Kernel Selection With Spectra of Multilevel Circulant Matrices

摘要

Kernel selection aims at choosing an appropriate kernel function for kernel-based learning algorithms to avoid either underfitting or overfitting of the resulting hypothesis. One of the main problems faced by kernel selection is the evaluation of the goodness of a kernel, which is typically difficult and computationally expensive. In this paper, we propose a randomized kernel selection approach to evaluate and select the kernel with the spectra of the specifically designed multilevel circulant matrices (MCMs), which is statistically sound and computationally efficient. Instead of constructing the kernel matrix, we construct the randomized MCM to encode the kernel function and all data points together with labels. We build a one-to-one correspondence between all candidate kernel functions and the spectra of the randomized MCMs by Fourier transform. We prove the statistical properties of the randomized MCMs and the randomized kernel selection criteria, which theoretically qualify the utility of the randomized criteria in kernel selection. With the spectra of the randomized MCMs, we derive a series of randomized criteria to conduct kernel selection, which can be computed in log-linear time and linear space complexity by fast Fourier transform (FFT). Experimental results demonstrate that our randomized kernel selection criteria are significantly more efficient than the existing classic and widely-used criteria while preserving similar predictive performance.

关键词

Kernel Selection; Multilevel Circulant Matrices; Spectra; Large-Scale Kernel Methods

全文

PDF


Coupled Poisson Factorization Integrated With User/Item Metadata for Modeling Popular and Sparse Ratings in Scalable Recommendation

摘要

Modelling sparse and large data sets is highly in demand yet challenging in recommender systems. With the computation only on the non-zero ratings, Poisson Factorization (PF) enabled by variational inference has shown its high efficiency in scalable recommendation, e.g., modeling millions of ratings. However, as PF learns the ratings by individual users on items with the Gamma distribution, it cannot capture the coupling relations between users (items) and the rating popularity (i.e., favorable rating scores that are given to one item) and rating sparsity (i.e., those users (items) with many zero ratings) for one item (user). This work proposes Coupled Poisson Factorization (CPF) to learn the couplings between users (items), and the user/item attributes (i.e., metadata) are integrated into CPF to form the Metadata-integrated CPF (mCPF) to not only handle sparse but also popular ratings in very large-scale data. Our empirical results show that the proposed models significantly outperform PF and address the key limitations in PF for scalable recommendation.

关键词

全文

PDF


Learning From Semi-Supervised Weak-Label Data

摘要

Multi-label learning deals with data objects associated with multiple labels simultaneously. Previous studies typically assume that for each instance, the full set of relevant labels associated with each training instance is given. In many applicationssuch as image annotation, however, it’s usually difficult to get the full label set for each instance and only a partial or even empty set of relevant labels is available. We call this kind of problem as "semi-supervised weak-label learning" problem. In this work we propose the SSWL (Semi-Supervised Weak-Label) method to address this problem. Both instance similarity and label similarity are considered for the complement of missing labels. Ensemble of multiple models are utilized to improve the robustness when label information is insufficient. We formulate the objective as a bi-convex optimization problem with an efficient block coordinate descent algorithm. Experiments validate the effectiveness of SSWL.

关键词

全文

PDF


Decomposition Strategies for Constructive Preference Elicitation

摘要

We tackle the problem of constructive preference elicitation, that is the problem of learning user preferences over verylarge decision problems, involving a combinatorial space of possible outcomes. In this setting, the suggested configuration is synthesized on-the-fly by solving a constrained optimization problem, while the preferences are learned iteratively by interacting with the user. Previous work has shown that Coactive Learning is a suitable method for learning userpreferences in constructive scenarios. In Coactive Learning the user provides feedback to the algorithm in the form of an improvement to a suggested configuration. When the problem involves many decision variables and constraints, this type of interaction poses a significant cognitive burden on the user. We propose a decomposition technique for large preference-based decision problems relying exclusively on inference and feedback over partial configurations. This has the clear advantage of drastically reducing the user cognitive load. Additionally, part-wise inference can be (up to exponentially) less computationally demanding than inference over full configurations. We discuss the theoretical implications of working with parts and present promising empirical results on one synthetic and two realistic constructive problems.

关键词

machine learning; structured output prediction; preference elicitation

全文

PDF


Constructive Preference Elicitation Over Hybrid Combinatorial Spaces

摘要

Peference elicitation is the task of suggesting a highly preferred configuration to a decision maker. The preferences are typically learned by querying the user for choice feedback over pairs or sets of objects. In its constructive variant, new objects are synthesized "from scratch" by maximizing an estimate of the user utility over a combinatorial (possibly infinite) space of candidates. In the constructive setting, most existing elicitation techniques fail because they rely on exhaustive enumeration of the candidates. A previous solution explicitly designed for constructive tasks comes with no formal performance guarantees, and can be very expensive in (or unapplicable to) problems with non-Boolean attributes. We propose the Choice Perceptron, a Perceptron-like algorithm for learning user preferences from set-wise choice feedback over constructive domains and hybrid Boolean-numeric feature spaces. We provide a theoretical analysis on the attained regret that holds for a large class of query selection strategies, and devise a heuristic strategy that aims at optimizing the regret in practice. Finally, we demonstrate its effectiveness by empirical evaluation against existing competitors on constructive scenarios of increasing complexity.

关键词

machine learning, structured output prediction, preference elicitation

全文

PDF


Learning to Rank Based on Analogical Reasoning

摘要

Object ranking or "learning to rank" is an important problem in the realm of preference learning. On the basis of training data in the form of a set of rankings of objects represented as feature vectors, the goal is to learn a ranking function that predicts a linear order of any new set of objects. In this paper, we propose a new approach to object ranking based on principles of analogical reasoning. More specifically, our inference pattern is formalized in terms of so-called analogical proportions and can be summarized as follows: Given objects A,B,C,D, if object A is known to be preferred to B, and C relates to D as A relates to B, then C is (supposedly) preferred to D. Our method applies this pattern as a main building block and combines it with ideas and techniques from instance-based learning and rank aggregation. Based on first experimental results for data sets from various domains (sports, education, tourism, etc.), we conclude that our approach is highly competitive. It appears to be specifically interesting in situations in which the objects are coming from different subdomains, and which hence require a kind of knowledge transfer.

关键词

Preference Learning, Object Ranking, Analogical Reasoning, Pairwise Preferences

全文

PDF


Learning Lexicographic Preference Trees From Positive Examples

摘要

This paper considers the task of learning the preferences of users on a combinatorial set of alternatives, as it can be the case for example with online configurators. In many settings, what is available to the learner is a set of positive examples of alternatives that have been selected during past interactions. We propose to learn a model of the users' preferences that ranks previously chosen alternatives as high as possible. In this paper, we study the particular task of learning conditional lexicographic preferences. We present an algorithm to learn several classes of lexicographic preference trees, prove convergence properties of the algorithm, and experiment on both synthetic data and on a real-world bench in the domain of recommendation in interactive configuration.

关键词

全文

PDF


AutoEncoder by Forest

摘要

Auto-encoding is an important task which is typically realized by deep neural networks (DNNs) such as convolutional neural networks (CNN). In this paper, we propose EncoderForest (abbrv. eForest), the first tree ensemble based auto-encoder. We present a procedure for enabling forests to do backward reconstruction by utilizing the Maximal-Compatible Rule (MCR) defined by the decision paths of the trees, and demonstrate its usage in both supervised and unsupervised setting. Experiments show that, compared with DNN based auto-encoders, eForest is able to obtain lower reconstruction error with fast training speed, while the model itself is reusable and damage-tolerable.

关键词

全文

PDF


Counterfactual Multi-Agent Policy Gradients

摘要

Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent's action, while keeping the other agents' actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.

关键词

deep reinforcement learning; multi-­agent learning; actor­critic

全文

PDF


Lagrangian Constrained Community Detection

摘要

Semi-supervised or constrained community detection incorporates side information to findcommunities of interest in complex networks. The supervision is often represented as constraints such as known labels and pairwise constraints. Existing constrained community detection approaches often fail to fully benefit from the available side information. This results in poor performance for scenarios such as: when the constraints are required to be fully satisfied, when there is a high confidence about the correctness of the supervision information, and in situations where the side information is expensive or hard to achieve and is only available in a limited amount. In this paper, we propose a new constrained community detection algorithm based on Lagrangian multipliers to incorporate and fully satisfy the instance level supervisio nconstraints. Our proposed algorithm can more fully utilise available side information and find better quality solutions. Our experiments on real and synthetic data sets show our proposed LagCCD algorithm outperforms existing algorithms in terms of solution quality, ability to satisfy the constraints and noise resistance.

关键词

Semi-supervised Learning; Community Detection, Lagrange Multipliers Method

全文

PDF


DID: Distributed Incremental Block Coordinate Descent for Nonnegative Matrix Factorization

摘要

Nonnegative matrix factorization (NMF) has attracted much attention in the last decade as a dimension reduction method in many applications. Due to the explosion in the size of data, naturally the samples are collected and stored distributively in local computational nodes. Thus, there is a growing need to develop algorithms in a distributed memory architecture. We propose a novel distributed algorithm, called distributed incremental block coordinate descent (DID), to solve the problem. By adapting the block coordinate descent framework, closed-form update rules are obtained in DID. Moreover, DID performs updates incrementally based on the most recently updated residual matrix. As a result, only one communication step per iteration is required. The correctness, efficiency, and scalability of the proposed algorithm are verified in a series of numerical experiments.

关键词

Nonnegative Matrix Factorization, Clustering, Dimension Reduction

全文

PDF


Incomplete Label Multi-Task Ordinal Regression for Spatial Event Scale Forecasting

摘要

Event scales are commonly used by practitioners to gauge subjective feelings on the magnitude and significance of social events. For example, the Centers for Disease Control and Prevention (CDC) utilizes a 10-level scale to distinguish the severity of flu outbreaks and governments typically categorize violent outbreaks based on their intensity as reflected in multiple aspects. Effective forecasting of future event scales can be used qualitatively to determine reasonable resource allocations and facilitate accurate proactive actions by practitioners. Existing spatial event forecasting methods typically focus on the occurrence of events rather than their ordinal event scales as this is very challenging in several respects, including 1) the ordinal nature of the event scale, 2) the spatial heterogeneity of event scaling in different geo-locations, 3) the incompleteness of scale label data for some spatial locations, and 4) the spatial correlation of event scale patterns. In order to address all these challenges concurrently, a MultI-Task Ordinal Regression (MITOR) framework is proposed to effectively forecast the scale of future events. Our model enforces similar feature sparsity patterns for different tasks while preserving the heterogeneity in their scale patterns. In addition, based on the first law of geography, we proposed to enforce spatially-closed tasks to share similar scale patterns with theoretical guarantees. Optimizing the proposed model amounts to a new non-convex and non-smooth problem with an isotonicity constraint, which is then solved by our new algorithm based on ADMM and dynamic programming. Extensive experiments on ten real-world datasets demonstrate the effectiveness and efficiency of the proposed model.

关键词

Multitask learning; Ordinal regression; Spital event forecasting

全文

PDF


Learning Combinatory Categorial Grammars for Plan Recognition

摘要

This paper defines a learning algorithm for plan grammars used for plan recognition. The algorithm learns Combinatory Categorial Grammars (CCGs) that capture the structure of plans from a set of successful plan execution traces paired with the goal of the actions. This work is motivated by past work on CCG learning algorithms for natural language processing, and is evaluated on five well know planning domains.

关键词

CCG; Plan Learning

全文

PDF


Characterization of the Convex Łukasiewicz Fragment for Learning From Constraints

摘要

This paper provides a theoretical insight for the integration of logical constraints into a learning process. In particular it is proved that a fragment of the Łukasiewicz logic yields a set of convex constraints. The fragment is enough expressive to include many formulas of interest such as Horn clauses. Using the isomorphism of Łukasiewicz formulas and McNaughton functions, logical constraints are mapped to a set of linear constraints once the predicates are grounded on a given sample set. In this framework, it is shown how a collective classification scheme can be formulated as a quadratic programming problem, but the presented theory can be exploited in general to embed logical constraints into a learning process. The proposed approach is evaluated on a classification task to show how the use of the logical rules can be effective to improve the accuracy of a trained classifier.

关键词

Learning from constraints; Collective Classification; First-order logic; Quadratic programming

全文

PDF


Topic Modeling on Health Journals With Regularized Variational Inference

摘要

Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona---where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference (RVI) algorithm, which we use to encourage the DAP model's personas to be distinct. Our results show significant improvements over competing topic models---particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors.

关键词

machine learning, topic modeling, graphical model, regularized variational inference, healthcare

全文

PDF


Non-Discriminatory Machine Learning Through Convex Fairness Criteria

摘要

Cross-domain data reconstruction methods derive a shared transformation across source and target domains. These methods usually make a specific assumption on noise, which exhibits limited ability when the target data are contaminated by different kinds of complex noise in practice. To enhance the robustness of domain adaptation under severe noise conditions, this paper proposes a novel reconstruction based algorithm in an information-theoretic setting. Specifically, benefiting from the theoretical property of correntropy, the proposed algorithm is distinguished with: detecting the contaminated target samples without making any specific assumption on noise; greatly suppressing the negative influence of noise on cross-domain transformation. Moreover, a relative entropy based regularization of the transformation is incorporated to avoid trivial solutions with the reaped theoretic advantages, i.e., non-negativity and scale-invariance. For optimization, a half-quadratic technique is developed to minimize the non-convex information-theoretic objectives with explicitly guaranteed convergence. Experiments on two real-world domain adaptation tasks demonstrate the superiority of our method.

关键词

machine learning; non-discrimination; fairness

全文

PDF


Margin Based PU Learning

摘要

The PU learning problem concerns about learning from positive and unlabeled data. A popular heuristic is to iteratively enlarge training set based on some margin-based criterion. However, little theoretical analysis has been conducted to support the success of these heuristic methods. In this work, we show that not all margin-based heuristic rules are able to improve the learned classifiers iteratively. We find that a so-called large positive margin oracle is necessary to guarantee the success of PU learning. Under this oracle, a provable positive-margin based PU learning algorithm is proposed for linear regression and classification under the truncated Gaussian distributions. The proposed algorithm is able to reduce the recovering error geometrically proportional to the positive margin. Extensive experiments on real-world datasets verify our theory and the state-of-the-art performance of the proposed PU learning algorithm.

关键词

PU Learning; Generalization error; Classification

全文

PDF


A Continuous Relaxation of Beam Search for End-to-End Training of Neural Sequence Models

摘要

Beam search is a desirable choice of test-time decoding algorithm for neural sequence models because it potentially avoids search errors made by simpler greedy methods. However, typical cross entropy training procedures for these models do not directly consider the behaviour of the final decoding method. As a result, for cross-entropy trained models, beam decoding can sometimes yield reduced test performance when compared with greedy decoding. In order to train models that can more effectively make use of beam search, we propose a new training procedure that focuses on the final loss metric (e.g. Hamming loss) evaluated on the output of beam search. While well-defined, this "direct loss" objective is itself discontinuous and thus difficult to optimize. Hence, in our approach, we form a sub-differentiable surrogate objective by introducing a novel continuous approximation of the beam search decoding procedure.In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.

关键词

Beam Search; Continuous Relaxation; Neural Sequence models; seq2seq models

全文

PDF


Human Guided Linear Regression With Feature-Level Constraints

摘要

Linear regression methods are commonly used by both researchers and data scientists due to their interpretability and their reduced likelihood of overfitting. However, these methods can still perform poorly if little labeled training data is available. Typical methods used to overcome a lack of labeled training data somehow involve exploiting an outside source of labeled data or large amounts of unlabeled data. This includes areas such as active learning, semi-supervised learning and transfer learning, but in many domains these approaches are not always applicable because they require either a mechanism to label data, large amounts of unlabeled data or additional sources of sufficiently related data. In this paper we explore an alternative, non-data centric approach. We allow the user to guide the learning system through three forms of feature-level guidance which constrain the parameters of the regression function. Such guidance is unlikely to be perfectly accurate, so we derive methods which are robust to some amounts of noise, a property we formally prove for one of our methods.

关键词

Linear Regression; Human Guidance; Feature Constraints

全文

PDF


Learning Predictive State Representations From Non-Uniform Sampling

摘要

Predictive state representations (PSR) have emerged as a powerful method for modelling partially observable environments. PSR learning algorithms can build models for predicting all observable variables, or predicting only some of them conditioned on others (e.g., actions or exogenous variables). In the latter case, which we call conditional modelling, the accuracy of different estimates of the conditional probabilities for a fixed dataset can vary significantly, due to the limited sampling of certain conditions. This can have negative consequences on the PSR parameter estimation process, which are not taken into account by the current state-of-the-art PSR spectral learning algorithms. In this paper, we examine closely conditional modelling within the PSR framework. We first establish a new positive but surprisingly non-trivial result: a conditional model can never be larger than the complete model. Then, we address the core shortcoming of existing PSR spectral learning methods for conditional models by incorporating an additional step in the process, which can be seen as a type of matrix denoising. We further refine this objective by adding penalty terms for violations of the system dynamics matrix structure, which improves the PSR predictive performance. Empirical evaluations on both synthetic and real datasets highlight the advantages of the proposed approach.

关键词

Sequential Data Modelling; Reinforcement Learning

全文

PDF


Flow-GAN: Combining Maximum Likelihood and Adversarial Learning in Generative Models

摘要

Adversarial learning of probabilistic models has recently emerged as a promising alternative to maximum likelihood. Implicit models such as generative adversarial networks (GAN) often generate better samples compared to explicit models trained by maximum likelihood. Yet, GANs sidestep the characterization of an explicit density which makes quantitative evaluations challenging. To bridge this gap, we propose Flow-GANs, a generative adversarial network for which we can perform exact likelihood evaluation, thus supporting both adversarial and maximum likelihood training. When trained adversarially, Flow-GANs generate high-quality samples but attain extremely poor log-likelihood scores, inferior even to a mixture model memorizing the training data; the opposite is true when trained by maximum likelihood. Results on MNIST and CIFAR-10 demonstrate that hybrid training can attain high held-out likelihoods while retaining visual fidelity in the generated samples.

关键词

generative models; adversarial learning; unsupervised learning; maximum likelihood estimation; normalizing flow models

全文

PDF


Boosted Generative Models

摘要

We propose a novel approach for using unsupervised boosting to create an ensemble of generative models, where models are trained in sequence to correct earlier mistakes. Our meta-algorithmic framework can leverage any existing base learner that permits likelihood evaluation, including recent deep expressive models. Further, our approach allows the ensemble to include discriminative models trained to distinguish real data from model-generated data. We show theoretical conditions under which incorporating a new model in the ensemble will improve the fit and empirically demonstrate the effectiveness of our black-box boosting algorithms on density estimation, classification, and sample generation on benchmark datasets for a wide range of generative models.

关键词

boosting; generative models; unsupervised learning

全文

PDF


Asynchronous Doubly Stochastic Sparse Kernel Learning

摘要

Kernel methods have achieved tremendous success in the past two decades. In the current big data era, data collection has grown tremendously. However, existing kernel methods are not scalable enough both at the training and predicting steps. To address this challenge, in this paper, we first introduce a general sparse kernel learning formulation based on the random feature approximation, where the loss functions are possibly non-convex. Then we propose a new asynchronous parallel doubly stochastic algorithm for large scale sparse kernel learning (AsyDSSKL). To the best our knowledge, AsyDSSKL is the first algorithm with the techniques of asynchronous parallel computation and doubly stochastic optimization. We also provide a comprehensive convergence guarantee to AsyDSSKL. Importantly, the experimental results on various large-scale real-world datasets show that, our AsyDSSKL method has the significant superiority on the computational efficiency at the training and predicting steps over the existing kernel methods.

关键词

Kernel Learning; Sparse Kernel; Asynchronous Doubly Stochastic Algorithm

全文

PDF


Inexact Proximal Gradient Methods for Non-Convex and Non-Smooth Optimization

摘要

In machine learning research, the proximal gradient methods are popular for solving various optimization problems with non-smooth regularization. Inexact proximal gradient methods are extremely important when exactly solving the proximal operator is time-consuming, or the proximal operator does not have an analytic solution. However, existing inexact proximal gradient methods only consider convex problems. The knowledge of inexact proximal gradient methods in the non-convex setting is very limited. To address this challenge, in this paper, we first propose three inexact proximal gradient algorithms, including the basic version and Nesterov’s accelerated version. After that, we provide the theoretical analysis to the basic and Nesterov’s accelerated versions. The theoretical results show that our inexact proximal gradient algorithms can have the same convergence rates as the ones of exact proximal gradient algorithms in the non-convex setting. Finally, we show the applications of our inexact proximal gradient algorithms on three representative non-convex learning problems. Empirical results confirm the superiority of our new inexact proximal gradient algorithms.

关键词

Inexact Proximal Gradient; Non-Convex and Non-Smooth Optimization

全文

PDF


An Euclidean Distance Based on Tensor Product Graph Diffusion Related Attribute Value Embedding for Nominal Data Clustering

摘要

Not like numerical data clustering, nominal data clustering is a very difficult problem because there exists no natural relative ordering between nominal attribute values. This paper mainly aims to make the Euclidean distance measure appropriate to nominal data clustering, and the core idea is the attribute value embedding, namely, transforming each nominal attribute value into a numerical vector. This embedding method consists of four steps. In the first step, the weights, which can quantify the amount of information in attribute values, is calculated for each value in each nominal attribute based on each object and its k nearest neighbors. In the second step, an intra-attribute value similarity matrix is created for each nominal attribute by using the attribute value's weights. In the third step, for each nominal attribute, we find another attribute with the maximal dependence on it, and build an inter-attribute value similarity matrix on the basis of the attribute value's weights related to these two attributes. In the last step, a diffusion matrix of each nominal attribute is constructed by the tensor product graph diffusion process, and this step can cause the acquired value embedding to contain simultaneously the intra- and inter-attribute value similarities information. To evaluate the effectiveness of our proposed method, experiments are done on 10 data sets. Experimental results demonstrate that our method not only enables the Euclidean distance to be used for nominal data clustering, but also can acquire the better clustering performance than several existing state-of-the-art approaches.

关键词

Nominal data;Clustering

全文

PDF


Who Said What: Modeling Individual Labelers Improves Classification

摘要

Data are often labeled by many different experts with each expert only labeling a small fraction of the data and each data point being labeled by several experts. This reduces the workload on individual experts and also gives a better estimate of the unobserved ground truth. When experts disagree, the standard approaches are to treat the majority opinion as the correct label or to model the correct label as a distribution. These approaches, however, do not make any use of potentially valuable information about which expert produced which label. To make use of this extra information, we propose modeling the experts individually and then learning averaging weights for combining them, possibly in sample-specific ways. This allows us to give more weight to more reliable experts and take advantage of the unique strengths of individual experts at classifying certain types of data. Here we show that our approach leads to improvements in computer-aided diagnosis of diabetic retinopathy. We also show that our method performs better than competing algorithms by Welinder and Perona (2010); Mnih and Hinton (2012). Our work offers an innovative approach for dealing with the myriad real-world settings that use expert opinions to define labels for training.

关键词

Deep Learning/Neural Networks, Classification

全文

PDF


Nonparametric Stochastic Contextual Bandits

摘要

We analyze the K-armed bandit problem where the reward for each arm is a noisy realization based on an observed context under mild nonparametric assumptions.We attain tight results for top-arm identification and a sublinear regret of Õ(T1+D/(2+D), where D is the context dimension, for a modified UCB algorithm that is simple to implement. We then give global intrinsic dimension dependent and ambient dimension independent regret bounds. We also discuss recovering topological structures within the context space based on expected bandit performance and provide an extension to infinite-armed contextual bandits. Finally, we experimentally show the improvement of our algorithm over existing approaches for both simulated tasks and MNIST image classification.

关键词

Online Learning; Learning Theory; Dimensionality Reduction/Feature Selection

全文

PDF


A General Formulation for Safely Exploiting Weakly Supervised Data

摘要

Weakly supervised data helps improve learning performance, which is an important machine learning data. However, recent results indicate that machine learning techniques with the usage of weakly supervised data may sometimes lead to performance degradation. How to safely leverage weakly supervised data has become an important issue, whereas there is only very limited effort, especially on a general formulation to help provide insight to understand safe weakly supervised learning. In this paper we present a scheme, which builds the final prediction results by integrating several weakly supervised learners. Our resultant formulation brings two implications. i) It has safeness guarantees for the commonly used convex loss functions in both regression and classification tasks of weakly supervised learning; ii) It can embed uncertain prior knowledge about the importance of base learners flexibly. Moreover, our formulation can be addressed globally by simple convex quadratic program or linear program in an efficient manner. Experiments on multiple weakly supervised learning tasks such as label noise learning, domain adaptation and semi-supervised learning validate the effectiveness of our proposed algorithms.

关键词

weakly supervised learning

全文

PDF


Double Forward Propagation for Memorized Batch Normalization

摘要

Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs). Although the standard BN can significantly accelerate the training of DNNs and improve the generalization performance, it has several underlying limitations which may hamper the performance in both training and inference. In the training stage, BN relies on estimating the mean and variance of data using a single mini-batch. Consequently, BN can be unstable when the batch size is very small or the data is poorly sampled. In the inference stage, BN often uses the so called moving mean and moving variance instead of batch statistics, i.e., the training and inference rules in BN are not consistent. Regarding these issues, we propose a memorized batch normalization (MBN), which considers multiple recent batches to obtain more accurate and robust statistics. Note that after the SGD update for each batch, the model parameters will change, and the features will change accordingly, leading to the Distribution Shift before and after the update for the considered batch. To alleviate this issue, we present a simple Double-Forward scheme in MBN which can further improve the performance. Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference. Empirical results show that the MBN based models trained with the Double-Forward scheme greatly reduce the sensitivity of data and significantly improve the generalization performance.

关键词

Neural networks; Batch normalization; Training method

全文

PDF


Learning Across Scales---Multiscale Methods for Convolution Neural Networks

摘要

In this work, we establish the relation between optimal control and training deep Convolution Neural Networks (CNNs). We show that the forward propagation in CNNs can be interpreted as a time-dependent nonlinear differential equation and learning can be seen as controlling the parameters of the differential equation such that the network approximates the data-label relation for given training data. Using this continuous interpretation, we derive two new methods to scale CNNs with respect to two different dimensions. The first class of multiscale methods connects low-resolution and high-resolution data using prolongation and restriction of CNN parameters inspired by algebraic multigrid techniques. We demonstrate that our method enables classifying high-resolution images using CNNs trained with low-resolution images and vice versa and warm-starting the learning process. The second class of multiscale methods connects shallow and deep networks and leads to new training strategies that gradually increase the depths of the CNN while re-using parameters for initializations.

关键词

Deep Learning, Convolution Neural Networks, Optimal Control, Partial Differential Equations

全文

PDF


A Framework for Multistream Regression With Direct Density Ratio Estimation

摘要

Regression over a stream of data is challenging due to unbounded data size and non-stationary distribution over time. Typically, a traditional supervised regression model over a data stream is trained on data instances occurring within a short time period by assuming a stationary distribution. This model is later used to predict value of response-variable in future instances. Over time, the model may degrade in performance due to changes in data distribution among incoming data instances. Updating the model for change adaptation requires true value for every recent data instances, which is scarce in practice. To overcome this issue, recent studies have employed techniques that sample fewer instances to be used for model retraining. Yet, this may introduce sampling bias that adversely affects the model performance. In this paper, we study the regression problem over data streams in a novel setting. We consider two independent, yet related, non-stationary data streams, which are referred to as the source and the target stream. The target stream continuously generates data instances whose value of response variable is unknown. The source stream, however, continuously generates data instances along with corresponding value for the response-variable, and has a biased data distribution with respect to the target stream. We refer to the problem of using a model trained on the biased source stream to predict the response-variable’s value in data instances occurring on the target stream as Multistream Regression. In this paper, we describe a framework for multistream regression that simultaneously overcomes distribution bias and detects change in data distribution represented by the two streams over time using a Gaussian kernel model. We analyze the theoretical properties of the proposed approach and empirically evaluate it on both real-world and synthetic data sets. Importantly, our results indicate superior performance by the framework compared to other baseline regression methods.

关键词

data stream; regression; density ratio estimation

全文

PDF


Approximate and Exact Enumeration of Rule Models

摘要

In machine learning, rule models are one of the most popular choices when model interpretability is the primary concern. Ordinary, a single model is obtained by solving an optimization problem, and the resulting model is interpreted as the one that best explains the data. In this study, instead of finding a single rule model, we propose algorithms for enumerating multiple rule models. Model enumeration is useful in practice when (i) users want to choose a model that is particularly suited to their task knowledge, or (ii) users want to obtain several possible mechanisms that could be underlying the data to use as hypotheses for further scientific studies. To this end, we propose two enumeration algorithms: an approximate algorithm and an exact algorithm. We prove that these algorithms can enumerate models in a descending order of their objective function values approximately and exactly. We then confirm our theoretical results through experiments on real-world data. We also show that, by using the proposed enumeration algorithms, we can find several different models of almost equal quality.

关键词

全文

PDF


When Waiting Is Not an Option: Learning Options With a Deliberation Cost

摘要

Recent work has shown that temporally extended actions (options) can be learned fully end-to-end as opposed to being specified in advance. While the problem of how to learn options is increasingly well understood, the question of what good options should be has remained elusive. We formulate our answer to what good options should be in the bounded rationality framework (Simon, 1957) through the notion of deliberation cost. We then derive practical gradient-based learning algorithms to implement this objective. Our results in the Arcade Learning Environment (ALE) show increased performance and interpretability.

关键词

reinforcement learning; bounded rationality; temporal abstraction

全文

PDF


Learning With Options That Terminate Off-Policy

摘要

A temporally abstract action, or an option, is specified by a policy and a termination condition: the policy guides the option behavior, and the termination condition roughly determines its length. Generally, learning with longer options (like learning with multi-step returns) is known to be more efficient. However, if the option set for the task is not ideal, and cannot express the primitive optimal policy well, shorter options offer more flexibility and can yield a better solution. Thus, the termination condition puts learning efficiency at odds with solution quality. We propose to resolve this dilemma by decoupling the behavior and target terminations, just like it is done with policies in off-policy learning. To this end, we give a new algorithm, Q(beta), that learns the solution with respect to any termination condition, regardless of how the options actually terminate. We derive Q(beta) by casting learning with options into a common framework with well-studied multi-step off policy learning. We validate our algorithm empirically, and show that it holds up to its motivating claims.

关键词

Reinforcement Learning

全文

PDF


Reinforced Multi-Label Image Classification by Exploring Curriculum

摘要

Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones. Inspired by this curriculum learning mechanism, we propose a reinforced multi-label image classification approach imitating human behavior to label image from easy to complex. This approach allows a reinforcement learning agent to sequentially predict labels by fully exploiting image feature and previously predicted labels. The agent discovers the optimal policies through maximizing the long-term reward which reflects prediction accuracies. Experimental results on PASCAL VOC2007 and 2012 demonstrate the necessity of reinforcement multi-label learning and the algorithm’s effectiveness in real-world multi-label image classification tasks.

关键词

全文

PDF


An Efficient, Expressive and Local Minima-Free Method for Learning Controlled Dynamical Systems

摘要

We propose a framework for modeling and estimating the state of controlled dynamical systems, where an agent can affect the system through actions and receives partial observations. Based on this framework, we propose Predictive State Representation with Random Fourier Features (RFF-PSR). A key property in RFF-PSRs is that the state estimate is represented by a conditional distribution of future observations given future actions. RFFPSRs combine this representation with moment-matching, kernel embedding, and local optimization to achieve a method that enjoys several favorable qualities: It can represent controlled environments which can be affected by actions, it has an efficient and theoretically justified learning algorithm, it uses a non-parametric representation that has expressive power to represent continuous non-linear dynamics. We provide a detailed formulation, a theoretical analysis and an experimental evaluation that demonstrates the effectiveness of our method.

关键词

System Identification; Predictive State Representations;

全文

PDF


OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning

摘要

Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations. Yet in reality, the corpus of demonstrations may contain trajectories arising from a diverse set of underlying reward functions rather than a single one. Thus, in inverse reinforcement learning, it is useful to consider such a decomposition. The options framework in reinforcement learning is specifically designed to decompose policies in a similar light. We therefore extend the options framework and propose a method to simultaneously recover reward options in addition to policy options. We leverage adversarial methods to learn joint reward-policy options using only observed expert states. We show that this approach works well in both simple and complex continuous control tasks and shows significant performance increases in one-shot transfer learning.

关键词

Reinforcement Learning; Applications of Reinforcement Learning; Transfer, Adaptation, Multitask Learning

全文

PDF


Deep Reinforcement Learning That Matters

摘要

In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL). Reproducing existing work and accurately judging the improvements offered by novel methods is vital to sustaining this progress. Unfortunately, reproducing results for state-of-the-art deep RL methods is seldom straightforward. In particular, non-determinism in standard benchmark environments, combined with variance intrinsic to the methods, can make reported results tough to interpret. Without significance metrics and tighter standardization of experimental reporting, it is difficult to determine whether improvements over the prior state-of-the-art are meaningful. In this paper, we investigate challenges posed by reproducibility, proper experimental techniques, and reporting procedures. We illustrate the variability in reported metrics and results when comparing against common baselines and suggest guidelines to make future results in deep RL more reproducible. We aim to spur discussion about how to ensure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible and easily misinterpreted.

关键词

Reinforcement Learning; Applications of Reinforcement Learning; Machine Learning

全文

PDF


Rainbow: Combining Improvements in Deep Reinforcement Learning

摘要

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined. This paper examines six extensions to the DQN algorithm and empirically studies their combination. Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance. We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.

关键词

deep reinforcement learning;

全文

PDF


Deep Q-learning From Demonstrations

摘要

Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable for a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this paper we study a setting where the agent may access data from previous control of the system. We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages small sets of demonstration data to massively accelerate the learning process even from relatively small amounts of demonstration data and is able to automatically assess the necessary ratio of demonstration data while learning thanks to a prioritized replay mechanism. DQfD works by combining temporal difference updates with supervised classification of the demonstrator’s actions. We show that DQfD has better initial performance than Prioritized Dueling Double Deep Q-Networks (PDD DQN) as it starts with better scores on the first million steps on 41 of 42 games and on average it takes PDD DQN 83 million steps to catch up to DQfD’s performance. DQfD learns to out-perform the best demonstration given in 14 of 42 games. In addition, DQfD leverages human demonstrations to achieve state-of-the-art results for 11 games. Finally, we show that DQfD performs better than three related algorithms for incorporating demonstration data into DQN.

关键词

Reinforcement Learning; Learning from Demonstratinos

全文

PDF


Decentralized High-Dimensional Bayesian Optimization With Factor Graphs

摘要

This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee no-regret performance asymptotically. Empirical evaluation on synthetic and real-world experiments (e.g., sparse Gaussian process model with 1811 hyperparameters) shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.

关键词

全文

PDF


A Deep Model With Local Surrogate Loss for General Cost-Sensitive Multi-Label Learning

摘要

Multi-label learning is an important machine learning problem with a wide range of applications. The variety of criteria for satisfying different application needs calls for cost-sensitive algorithms, which can adapt to different criteria easily. Nevertheless, because of the sophisticated nature of the criteria for multi-label learning, cost-sensitive algorithms for general criteria are hard to design, and current cost-sensitive algorithms can at most deal with some special types of criteria. In this work, we propose a novel cost-sensitive multi-label learning model for any general criteria. Our key idea within the model is to iteratively estimate a surrogate loss that approximates the sophisticated criterion of interest near some local neighborhood, and use the estimate to decide a descent direction for optimization. The key idea is then coupled with deep learning to form our proposed model. Experimental results validate that our proposed model is superior to existing cost-sensitive algorithms and existing deep learning models across different criteria.

关键词

全文

PDF


From Hashing to CNNs: Training Binary Weight Networks via Hashing

摘要

Deep convolutional neural networks (CNNs) have shown appealing performance on various computer vision tasks in recent years. This motivates people to deploy CNNs to real-world applications. However, most of state-of-art CNNs require large memory and computational resources, which hinders the deployment on mobile devices. Recent studies show that low-bit weight representation can reduce much storage and memory demand, and also can achieve efficient network inference. To achieve this goal, we propose a novel approach named BWNH to train Binary Weight Networks via Hashing. In this paper, we first reveal the strong connection between inner-product preserving hashing and binary weight networks, and show that training binary weight networks can be intrinsically regarded as a hashing problem. Based on this perspective, we propose an alternating optimization method to learn the hash codes instead of directly learning binary weights. Extensive experiments on CIFAR10, CIFAR100 and ImageNet demonstrate that our proposed BWNH outperforms current state-of-art by a large margin.

关键词

Hashing;Binary Weight Network;CNNs

全文

PDF


SNNN: Promoting Word Sentiment and Negation in Neural Sentiment Classification

摘要

We mainly investigate word influence in neural sentiment classification, which results in a novel approach to promoting word sentiment and negation as attentions. Particularly, a sentiment and negation neural network (SNNN) is proposed, including a sentiment neural network (SNN) and a negation neural network (NNN). First, we modify the word level by embedding the word sentiment and negation information as the extra layers for the input. Second, we adopt a hierarchical LSTM model to generate the word-level, sentence-level and document-level representations respectively. After that, we enhance word sentiment and negation as attentions over the semantic level. Finally, the experiments conducting on the IMDB and Yelp data sets show that our approach is superior to the state-of-the-art baselines. Furthermore, we draw the interesting conclusions that (1) LSTM performs better than CNN and RNN for neural sentiment classification; (2) word sentiment and negation are a strong alliance with attention, while overfitting occurs when they are simultaneously applied at the embedding layer; and (3) word sentiment/negation can be singly implemented for better performance as both embedding layer and attention at the same time.

关键词

Neural Sentiment Classification; Word Sentiment; Word Negation; LSTM

全文

PDF


On Convergence of Epanechnikov Mean Shift

摘要

Epanechnikov Mean Shift is a simple yet empirically very effective algorithm for clustering. It localizes the centroids of data clusters via estimating modes of the probability distribution that generates the data points, using the "optimal" Epanechnikov kernel density estimator. However, since the procedure involves non-smooth kernel density functions,the convergence behavior of Epanechnikov mean shift lacks theoretical support as of this writing---most of the existing analyses are based on smooth functions and thus cannot be applied to Epanechnikov Mean Shift. In this work, we first show that the original Epanechnikov Mean Shift may indeed terminate at a non-critical point, due to the non-smoothness nature. Based on our analysis, we propose a simple remedy to fix it. The modified Epanechnikov Mean Shift is guaranteed to terminate at a local maximum of the estimated density, which corresponds to a cluster centroid, within a inite number of iterations. We also propose a way to avoid running the Mean Shift iterates from every data point, while maintaining good clustering accuracies under non-overlapping spherical Gaussian mixture models. This further pushes Epanechnikov Mean Shift to handle very large and high-dimensional data sets. Experiments show surprisingly good performance compared to the Lloyd's K-means algorithm and the EM algorithm.

关键词

全文

PDF


Orthogonal Weight Normalization: Solution to Optimization Over Multiple Dependent Stiefel Manifolds in Deep Neural Networks

摘要

Orthogonal matrix has shown advantages in training Recurrent Neural Networks (RNNs), but such matrix is limited to be square for the hidden-to-hidden transformation in RNNs. In this paper, we generalize such square orthogonal matrix to orthogonal rectangular matrix and formulating this problem in feed-forward Neural Networks (FNNs) as Optimization over Multiple Dependent Stiefel Manifolds (OMDSM). We show that the orthogonal rectangular matrix can stabilize the distribution of network activations and regularize FNNs. We propose a novel orthogonal weight normalization method to solve OMDSM. Particularly, it constructs orthogonal transformation over proxy parameters to ensure the weight matrix is orthogonal. To guarantee stability, we minimize the distortions between proxy parameters and canonical weights over all tractable orthogonal transformations. In addition, we design orthogonal linear module (OLM) to learn orthogonal filter banks in practice, which can be used as an alternative to standard linear module. Extensive experiments demonstrate that by simply substituting OLM for standard linear module without revising any experimental protocols, our method improves the performance of the state-of-the-art networks, including Inception and residual networks on CIFAR and ImageNet datasets.

关键词

Deep Learning; Neural Networks; Orthogonal matrix; Optimization; Stiefel Manifolds

全文

PDF


Building Deep Networks on Grassmann Manifolds

摘要

Learning representations on Grassmann manifolds is popular in quite a few visual recognition tasks. In order to enable deep learning on Grassmann manifolds, this paper proposes a deep network architecture by generalizing the Euclidean network paradigm to Grassmann manifolds. In particular, we design full rank mapping layers to transform input Grassmannian data to more desirable ones, exploit re-orthonormalization layers to normalize the resulting matrices, study projection pooling layers to reduce the model complexity in the Grassmannian context, and devise projection mapping layers to respect Grassmannian geometry and meanwhile achieve Euclidean forms for regular output layers. To train the Grassmann networks, we exploit a stochastic gradient descent setting on manifolds of the connection weights, and study a matrix generalization of backpropagation to update the structured data. The evaluations on three visual recognition tasks show that our Grassmann networks have clear advantages over existing Grassmann learning methods, and achieve results comparable with state-of-the-art approaches.

关键词

Grassmann manifolds, Grassmann networks

全文

PDF


Accelerated Method for Stochastic Composition Optimization With Nonsmooth Regularization

摘要

Stochastic composition optimization draws much attention recently and has been successful in many emerging applications of machine learning, statistical analysis, and reinforcement learning. In this paper, we focus on the composition problem with nonsmooth regularization penalty. Previous works either have slow convergence rate, or do not provide complete convergence analysis for the general problem. In this paper, we tackle these two issues by proposing a new stochastic composition optimization method for composition problem with nonsmooth regularization penalty. In our method, we apply variance reduction technique to accelerate the speed of convergence. To the best of our knowledge, our method admits the fastest convergence rate for stochastic composition optimization: for strongly convex composition problem, our algorithm is proved to admit linear convergence; for general composition problem, our algorithm significantly improves the state-of-the-art convergence rate from O(T–1/2) to O((n1+n2)2/3T-1). Finally, we apply our proposed algorithm to portfolio management and policy evaluation in reinforcement learning. Experimental results verify our theoretical analysis.

关键词

Stochastic Composition Optimization; Variance Reduction; Large-Scale Optimization

全文

PDF


Product Quantized Translation for Fast Nearest Neighbor Search

摘要

This paper proposes a simple nearest neighbor search algorithm, which provides the exact solution in terms of the Euclidean distance efficiently. Especially, we present an interesting approach to improve the speed of nearest neighbor search by proper translations of data and query although the task is inherently invariant to the Euclidean transformations. The proposed algorithm aims to eliminate nearest neighbor candidates effectively using their distance lower bounds in nonlinear embedded spaces, and further improves the lower bounds by transforming data and query through product quantized translations. Although our framework is composed of simple operations only, it achieves the state-of-the-art performance compared to existing nearest neighbor search techniques, which is illustrated quantitatively using various large-scale benchmark datasets in different sizes and dimensions.

关键词

Nearest Neighbor Search

全文

PDF


Selective Experience Replay for Lifelong Learning

摘要

Deep reinforcement learning has emerged as a powerful tool for a variety of learning tasks, however deep nets typically exhibit forgetting when learning multiple tasks in sequence. To mitigate forgetting, we propose an experience replay process that augments the standard FIFO buffer and selectively stores experiences in a long-term memory. We explore four strategies for selecting which experiences will be stored: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting, and is consistently the best approach on all domains tested. While distribution matching has better and more consistent performance, we identify one case in which coverage maximization is beneficial---when tasks that receive less trained are more important. Overall, our results show that selective experience replay, when suitable selection algorithms are employed, can prevent catastrophic forgetting.

关键词

Lifelong Machine Learning; Transfer Learning; Multi-task Learning; Experience Replay

全文

PDF


Label Distribution Learning by Exploiting Label Correlations

摘要

Label distribution learning (LDL) is a newly arisen machine learning method that has been increasingly studied in recent years. In theory, LDL can be seen as a generalization of multi-label learning. Previous studies have shown that LDL is an effective approach to solve the label ambiguity problem. However, the dramatic increase in the number of possible label sets brings a challenge in performance to LDL. In this paper, we propose a novel label distribution learning algorithm to address the above issue. The key idea is to exploit correlations between different labels. We encode the label correlation into a distance to measure the similarity of any two labels. Moreover, we construct a distance-mapping function from the label set to the parameter matrix. Experimental results on eight real label distributed data sets demonstrate that the proposed algorithm performs remarkably better than both the state-of-the-art LDL methods and multi-label learning methods.

关键词

Label distribution learning

全文

PDF


Metric-Based Auto-Instructor for Learning Mixed Data Representation

摘要

Mixed data with both categorical and continuous features are ubiquitous in real-world applications. Learning a good representation of mixed data is critical yet challenging for further learning tasks. Existing methods for representing mixed data often overlook the heterogeneous coupling relationships between categorical and continuous features as well as the discrimination between objects. To address these issues, we propose an auto-instructive representation learning scheme to enable margin-enhanced distance metric learning for a discrimination-enhanced representation. Accordingly, we design a metric-based auto-instructor (MAI) model which consists of two collaborative instructors. Each instructor captures the feature-level couplings in mixed data with fully connected networks, and guides the infinite-margin metric learning for the peer instructor with a contrastive order. By feeding the learned representation into both partition-based and density-based clustering methods, our experiments on eight UCI datasets show highly significant learning performance improvement and much more distinguishable visualization outcomes over the baseline methods.

关键词

representation learning; mixed data; metric learning

全文

PDF


Efficient Multi-Dimensional Tensor Sparse Coding Using t-Linear Combination

摘要

In this paper, we propose two novel multi-dimensional tensor sparse coding (MDTSC) schemes using the t-linear combination. Based on the t-linear combination, the shifted versions of the bases are used for the data approximation, but without need to store them. Therefore, the dictionaries of the proposed schemes are more concise and the coefficients have richer physical explanations. Moreover, we propose an efficient alternating minimization algorithm, including the tensor coefficient learning and the tensor dictionary learning, to solve the proposed problems. For the tensor coefficient learning, we design a tensor-based fast iterative shrinkage algorithm. For the tensor dictionary learning, we first divide the problem into several nearly-independent subproblems in the frequency domain, and then utilize the Lagrange dual to further reduce the number of optimization variables. Experimental results on multi-dimensional signals denoising and reconstruction (3DTSC, 4DTSC, 5DTSC) show that the proposed algorithms are more efficient and outperform the state-of-the-art tensor-based sparse coding models.

关键词

Tensor sparse coding; t-linear combination; Small-size dictionary; Shifting invariance

全文

PDF


PAC Reinforcement Learning With an Imperfect Model

摘要

Reinforcement learning (RL) methods have proved to be successful in many simulated environments. The common approaches, however, are often too sample intensive to be applied directly in the real world. A promising approach to addressing this issue is to train an RL agent in a simulator and transfer the solution to the real environment. When a high-fidelity simulator is available we would expect significant reduction in the amount of real trajectories needed for learning. In this work we aim at better understanding the theoretical nature of this approach. We start with a perhaps surprising result that, even if the approximate model (e.g., a simulator) only differs from the real environment in a single state-action pair (but which one is unknown), such a model could be information-theoretically useless and the sample complexity (in terms of real trajectories) still scales with the total number of states in the worst case. We investigate the hard instances and come up with natural conditions that avoid the pathological situations. We then propose two conceptually simple algorithms that enjoy polynomial sample complexity guarantees with no dependence on the size of the state-action space, and prove some foundational results to provide insights into this important problem.

关键词

reinforcement learning; PAC-MDP; exploration; sim-to-real transfer

全文

PDF


Asymmetric Deep Supervised Hashing

摘要

Hashing has been widely used for large-scale approximate nearest neighbor search because of its storage and search efficiency. Recent work has found that deep supervised hashing can significantly outperform non-deep supervised hashing in many applications. However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and database (retrieval) points. The training of these symmetric deep supervised hashing methods is typically time-consuming, which makes them hard to effectively utilize the supervised information for cases with large-scale database. In this paper, we propose a novel deep supervised hashing method, called asymmetric deep supervised hashing (ADSH), for large-scale nearest neighbor search. ADSH treats the query points and database points in an asymmetric way. More specifically, ADSH learns a deep hash function only for query points, while the hash codes for database points are directly learned. The training of ADSH is much more efficient than that of traditional symmetric deep supervised hashing methods. Experiments show that ADSH can achieve state-of-the-art performance in real applications.

关键词

Deep hashing; Large-scale retrieval; Image retrieval

全文

PDF


On Controlling the Size of Clusters in Probabilistic Clustering

摘要

Classical model-based partitional clustering algorithms, such ask-means or mixture of Gaussians, provide only loose and indirect control over the size of the resulting clusters. In this work, we present a family of probabilistic clustering models that can be steered towards clusters of desired size by providing a prior distribution over the possible sizes, allowing the analyst to fine-tune exploratory analysis or to produce clusters of suitable size for future down-stream processing.Our formulation supports arbitrary multimodal prior distributions, generalizing the previous work on clustering algorithms searching for clusters of equal size or algorithms designed for the microclustering task of finding small clusters. We provide practical methods for solving the problem, using integer programming for making the cluster assignments, and demonstrate that we can also automatically infer the number of clusters.

关键词

Clustering

全文

PDF


Less-Forgetful Learning for Domain Expansion in Deep Neural Networks

摘要

Expanding the domain that deep neural network has already learned without accessing old domain data is a challenging task because deep neural networks forget previously learned information when learning new data from a new domain. In this paper, we propose a less-forgetful learning method for the domain expansion scenario. While existing domain adaptation techniques solely focused on adapting to new domains, the proposed technique focuses on working well with both old and new domains without needing to know whether the input is from the old or new domain. First, we present two naive approaches which will be problematic, then we provide a new method using two proposed properties for less-forgetful learning. Finally, we prove the effectiveness of our method through experiments on image classification tasks. All datasets used in the paper, will be released on our website for someone's follow-up study.

关键词

deep learning; image classification; domain expansion; catastrophic forgetting; continual learning

全文

PDF


Unified Spectral Clustering With Optimal Graph

摘要

Spectral clustering has found extensive use in many areas. Most traditional spectral clustering algorithms work in three separate steps: similarity graph construction; continuous labels learning; discretizing the learned labels by k-means clustering. Such common practice has two potential flaws, which may lead to severe information loss and performance degradation. First, predefined similarity graph might not be optimal for subsequent clustering. It is well-accepted that similarity graph highly affects the clustering results. To this end, we propose to automatically learn similarity information from data and simultaneously consider the constraint that the similarity matrix has exact c connected components if there are c clusters. Second, the discrete solution may deviate from the spectral solution since k-means method is well-known as sensitive to the initialization of cluster centers. In this work, we transform the candidate solution into a new one that better approximates the discrete one. Finally, those three subtasks are integrated into a unified framework, with each subtask iteratively boosted by using the results of the others towards an overall optimal solution. It is known that the performance of a kernel method is largely determined by the choice of kernels. To tackle this practical problem of how to select the most suitable kernel for a particular data set, we further extend our model to incorporate multiple kernel learning ability. Extensive experiments demonstrate the superiority of our proposed method as compared to existing clustering approaches.

关键词

spectral clustering;graph construction;kernel method

全文

PDF


Batchwise Patching of Classifiers

摘要

In this work we present classifier patching, an approach for adapting an existing black-box classification model to new data. Instead of creating a new model, patching infers regions in the instance space where the existing model is error-prone by training a classifier on the previously misclassified data. It then learns a specific model to determine the error regions, which allows to patch the old model’s predictions for them. Patching relies on a strong, albeit unchangeable, existing base classifier, and the idea that the true labels of seen instances will be available in batches at some point in time after the original classification. We experimentally evaluate our approach, and show that it meets the original design goals. Moreover, we compare our approach to existing methods from the domain of ensemble stream classification in both concept drift and transfer learning situations. Patching adapts quickly and achieves high classification accuracy, outperforming state-of-the-art competitors in either adaptation speed or accuracy in many scenarios.

关键词

Classifier adaptation; concept drift; transfer learning

全文

PDF


Deep Semi-Random Features for Nonlinear Function Approximation

摘要

We propose semi-random features for nonlinear function approximation. The flexibility of semi-random feature lies between the fully adjustable units in deep learning and the random features used in kernel methods. For one hidden layer models with semi-random features, we prove with no unrealistic assumptions that the model classes contain an arbitrarily good function as the width increases (universality), and despite non-convexity, we can find such a good function (optimization theory) that generalizes to unseen new data (generalization bound). For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound. Depending on the problems, the generalization bound of deep semi-random features can be exponentially better than the known bounds of deep ReLU nets; our generalization error bound can be independent of the depth, the number of trainable weights as well as the input dimensionality. In experiments, we show that semi-random features can match the performance of neural networks by using slightly more units, and it outperforms random features by using significantly fewer units. Moreover, we introduce a new implicit ensemble method by using semi-random features.

关键词

Kernel Methods; Neural Networks; Random Features

全文

PDF


Measuring Catastrophic Forgetting in Neural Networks

摘要

Deep neural networks are used in many state-of-the-art systems for machine perception. Once a network is trained to do a specific task, e.g., bird classification, it cannot easily be trained to do new tasks, e.g., incrementally learning to recognize additional bird species or learning an entirely different task such as flower recognition. When new tasks are added, typical deep neural networks are prone to catastrophically forgetting previous tasks. Networks that are capable of assimilating new information incrementally, much like how humans form new memories over time, will be more efficient than re-training the model from scratch each time a new task needs to be learned. There have been multiple attempts to develop schemes that mitigate catastrophic forgetting, but these methods have not been directly compared, the tests used to evaluate them vary considerably, and these methods have only been evaluated on small-scale problems (e.g., MNIST). In this paper, we introduce new metrics and benchmarks for directly comparing five different mechanisms designed to mitigate catastrophic forgetting in neural networks: regularization, ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on real-world images and sounds show that the mechanism(s) that are critical for optimal performance vary based on the incremental training paradigm and type of data being used, but they all demonstrate that the catastrophic forgetting problem is not yet solved.

关键词

catastrophic forgetting; neural networks; lifelong learning; deep learning; machine learning; supervised learning; incremental learning

全文

PDF


Approximate Vanishing Ideal via Data Knotting

摘要

The vanishing ideal is a set of polynomials that takes zero value on the given data points. Originally proposed in computer algebra, the vanishing ideal has been recently exploited for extracting the nonlinear structures of data in many applications. To avoid overfitting to noisy data, the polynomials are often designed to approximately rather than exactly equal zero on the designated data. Although such approximations empirically demonstrate high performance, the sound algebraic structure of the vanishing ideal is lost. The present paper proposes a vanishing ideal that is tolerant to noisy data and also pursued to have a better algebraic structure. As a new problem, we simultaneously find a set of polynomials and data points for which the polynomials approximately vanish on the input data points, and almost exactly vanish on the discovered data points. In experimental classification tests, our method discovered much fewer and lower-degree polynomials than an existing state-of-the-art method. Consequently, our method accelerated the runtime of the classification tasks without degrading the classification accuracy.

关键词

Computer Algebra; Vanishing Ideal

全文

PDF


Feature Engineering for Predictive Modeling Using Reinforcement Learning

摘要

Feature engineering is a crucial step in the process of predictive modeling. It involves the transformation of given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no well-defined basis for performing effective feature engineering. It involves domain knowledge, intuition, and most of all, a lengthy process of trial and error. The human attention involved in overseeing this process significantly influences the cost of model generation. We present a new framework to automate feature engineering. It is based on performance driven exploration of a transformation graph, which systematically and compactly captures the space of given options. A highly efficient exploration strategy is derived through reinforcement learning on past examples.

关键词

Feature Engineering; Predictive Modeling; Supervised Learning; Reinforcement Learning

全文

PDF


Imitation Learning via Kernel Mean Embedding

摘要

Imitation learning refers to the problem where an agent learns a policy that mimics the demonstration provided by the expert, without any information on the cost function of the environment. Classical approaches to imitation learning usually rely on a restrictive class of cost functions that best explains the expert's demonstration, exemplified by linear functions of pre-defined features on states and actions. We show that the kernelization of a classical algorithm naturally reduces the imitation learning to a distribution learning problem, where the imitation policy tries to match the state-action visitation distribution of the expert. Closely related to our approach is the recent work on leveraging generative adversarial networks (GANs) for imitation learning, but our reduction to distribution learning is much simpler, robust to scarce expert demonstration, and sample efficient. We demonstrate the effectiveness of our approach on a wide range of high-dimensional control tasks.

关键词

Imitation Learning; Kernel Mean Embedding

全文

PDF


On the Optimal Bit Complexity of Circulant Binary Embedding

摘要

Binary embedding refers to methods for embedding points in Rd into vertices of a Hamming cube of dimension k, such that the normalized Hamming distance well preserves the pre-defined similarity between vectors in the original space. A common approach to binary embedding is to use random projection with unstructured projection, followed by one-bit quantization to produce binary codes, which has been proven that k = O(ε-2 log n) is required to approximate the angle up to epsilon-distortion, where n is the number of data. Of particular interest in this paper is circulant binary embedding (CBE) with angle preservation, where a random circulant matrix is used for projection. It yields comparable performance while achieving the nearly linear time and space complexities, compared to embedding methods relying on unstructured projection. To support promising empirical results, several non-asymptotic analysis have been introduced to establish conditions on the number of bits to meet epsilon-distortion embedding, where one of state-of-the-art achieves the optimal sample complexity k = O(ε-3 log n) while the distortion rate ε-3 is far from the optimality, compared to k = O(ε-2 log n). In this paper, to support promising empirical results of CBE, we extend the previous theoretical framework to address the optimal condition on the number of bits, achieving that CBE with k = O(ε-2 log n) approximates the angle up to ε-distortion under mild assumptions. We also provide numerical experiments to support our theoretical results.

关键词

Binary Embedding; Circulant Matrix

全文

PDF


Joint Dictionaries for Zero-Shot Learning

摘要

A classic approach toward zero-shot learning (ZSL) is to map the input domain to a set of semantically meaningful attributes that could be used later on to classify unseen classes of data (e.g. visual data). In this paper, we propose to learn a visual feature dictionary that has semantically meaningful atoms. Such a dictionary is learned via joint dictionary learning for the visual domain and the attribute domain, while enforcing the same sparse coding for both dictionaries. Our novel attribute aware formulation provides an algorithmic solution to the domain shift/hubness problem in ZSL. Upon learning the joint dictionaries, images from unseen classes can be mapped into the attribute space by finding the attribute aware joint sparse representation using solely the visual data. We demonstrate that our approach provides superior or comparable performance to that of the state of the art on benchmark datasets.

关键词

Zero-Shot Learning; Joint Dictionary Learning

全文

PDF


Dialogue Act Sequence Labeling Using Hierarchical Encoder With CRF

摘要

Dialogue Act recognition associate dialogue acts (i.e., semantic labels) to utterances in a conversation. The problem of associating semantic labels to utterances can be treated as a sequence labeling problem. In this work, we build a hierarchical recurrent neural network using bidirectional LSTM as a base unit and the conditional random field (CRF) as the top layer to classify each utterance into its corresponding dialogue act. The hierarchical network learns representations at multiple levels, i.e., word level, utterance level, and conversation level. The conversation level representations are input to the CRF layer, which takes into account not only all previous utterances but also their dialogue acts, thus modeling the dependency among both, labels and utterances, an important consideration of natural dialogue. We validate our approach on two different benchmark data sets, Switchboard and Meeting Recorder Dialogue Act, and show performance improvement over the state-of-the-art methods by 2.2% and 4.1% absolute points, respectively. It is worth noting that the inter-annotator agreement on Switchboard data set is 84%, and our method is able to achieve the accuracy of about 79% despite being trained on the noisy data.

关键词

CRF; LSTM; dialog act sequence labeling

全文

PDF


gOCCF: Graph-Theoretic One-Class Collaborative Filtering Based on Uninteresting Items

摘要

We investigate how to address the shortcomings of the popular One-Class Collaborative Filtering (OCCF) methods in handling challenging “sparse” dataset in one-class setting (e.g., clicked or bookmarked), and propose a novel graph-theoretic OCCF approach, named as gOCCF, by exploiting both positive preferences (derived from rated items) as well as negative preferences (derived from unrated items). In capturing both positive and negative preferences as a bipartite graph, further, we apply the graph shattering theory to determine the right amount of negative preferences to use. Then, we develop a suite of novel graph-based OCCF methods based on the random walk with restart and belief propagation methods. Through extensive experiments using 3 real-life datasets, we show that our gOCCF effectively addresses the sparsity challenge and significantly outperforms all of 8 competing methods in accuracy on very sparse datasets while providing comparable accuracy to the best performing OCCF methods on less sparse datasets. The datasets and implementations used in the empirical validation are available for access: https://goo.gl/sfiawn.

关键词

Recommender Systems; One-Class Collaborative Filtering; Data Sparsity Problem

全文

PDF


On Value Function Representation of Long Horizon Problems

摘要

In Reinforcement Learning, an intelligent agent has to make a sequence of decisions to accomplish a goal. If this sequence is long, then the agent has to plan over a long horizon. While learning the optimal policy and its value function is a well studied problem in Reinforcement Learning, this paper focuses on the structure of the optimal value function and how hard it is to represent the optimal value function. We show that the generalized Rademacher complexity of the hypothesis space of all optimal value functions is dependent on the planning horizon and independent of the state and action space size. Further, we present bounds on the action-gaps of action value functions and show that they can collapse if a long planning horizon is used. The theoretical results are verified empirically on randomly generated MDPs and on a grid-world fruit collection task using deep value function approximation. Our theoretical results highlight a connection between value function approximation and the Options framework and suggest that value functions should be decomposed along bottlenecks of the MDP's transition dynamics.

关键词

Reinforcement Learning

全文

PDF


Extremely Low Bit Neural Network: Squeeze the Last Bit Out With ADMM

摘要

Although deep learning models are highly effective for various learning tasks, their high computational costs prohibit the deployment to scenarios where either memory or computational resources are limited. In this paper, we focus on compressing and accelerating deep models with network weights represented by very small numbers of bits, referred to as extremely low bit neural network. We model this problem as a discretely constrained optimization problem. Borrowing the idea from Alternating Direction Method of Multipliers (ADMM), we decouple the continuous parameters from the discrete constraints of network, and cast the original hard problem into several subproblems. We propose to solve these subproblems using extragradient and iterative quantization algorithms that lead to considerably faster convergency compared to conventional optimization methods. Extensive experiments on image recognition and object detection verify that the proposed algorithm is more effective than state-of-the-art approaches when coming to extremely low bit neural network.

关键词

Low Bits Quantization

全文

PDF


Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

摘要

Policy optimization methods have shown great promise in solving complex reinforcement and imitation learning tasks. While model-free methods are broadly applicable, they often require many samples to optimize complex policies. Model-based methods greatly improve sample-efficiency but at the cost of poor generalization, requiring a carefully handcrafted model of the system dynamics for each task. Recently, hybrid methods have been successful in trading off applicability for improved sample-complexity. However, these have been limited to continuous action spaces. In this work, we present a new hybrid method based on an approximation of the dynamics as an expectation over the next state under the current policy. This relaxation allows us to derive a novel hybrid policy gradient estimator, combining score function and pathwise derivative estimators, that is applicable to discrete action spaces. We show significant gains in sample complexity, ranging between 1.7 and 25 times, when learning parameterized policies on Cart Pole, Acrobot, Mountain Car and Hand Mass. Our method is applicable to both discrete and continuous action spaces, when competing pathwise methods are limited to the latter.

关键词

reinforcement; learning; deep; discrete; action; spaces; sample; efficient; continuous; relaxation

全文

PDF


Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

摘要

Variations of human body skeletons may be considered as dynamic graphs, which are generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach for assembling the successes of local convolutional filtering and sequence learning ability of autoregressive moving average. To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are recursively performed on structured graph data of temporal and spatial domain. The proposed model is generic and principled as it can be generalized into other dynamic models. We theoretically prove the stability of STGC and provide an upper-bound of the signal transformation to be learnt. Further, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and the improvement over the state-of-the-art.

关键词

全文

PDF


Learning to Generalize: Meta-Learning for Domain Generalization

摘要

Domain shift refers to the well known problem that a model trained in one source domain performs poorly when appliedto a target domain with different statistics. Domain Generalization (DG) techniques attempt to alleviate this issue by producing models which by design generalize well to novel testing domains. We propose a novel meta-learning method for domain generalization. Rather than designing a specific model that is robust to domain shift as in most previous DG work, we propose a model agnostic training procedure for DG. Our algorithm simulates train/test domain shift during training by synthesizing virtual testing domains within each mini-batch. The meta-optimization objective requires that steps to improve training domain performance should also improve testing domain performance. This meta-learning procedure trains models with good generalization ability to novel domains. We evaluate our method and achieve state of the art results on a recent cross-domain image classification benchmark, as well demonstrating its potential on two classic reinforcement learning tasks.

关键词

Meta-Learning; Domain Generalization

全文

PDF


A Probabilistic Hierarchical Model for Multi-View and Multi-Feature Classification

摘要

Some recent works in classification show that the data obtained from various views with different sensors for an object contributes to achieving a remarkable performance. Actually, in many real-world applications, each view often contains multiple features, which means that this type of data has a hierarchical structure, while most of existing works do not take these features with multi-layer structure into consideration simultaneously. In this paper, a probabilistic hierarchical model is proposed to address this issue and applied for classification. In our model, a latent variable is first learned to fuse the multiple features obtained from a same view, sensor or modality. Particularly, mapping matrices corresponding to a certain view are estimated to project the latent variable from a shared space to the multiple observations. Since this method is designed for the supervised purpose, we assume that the latent variables associated with different views are influenced by their ground-truth label. In order to effectively solve the proposed method, the Expectation-Maximization (EM) algorithm is applied to estimate the parameters and latent variables. Experimental results on the extensive synthetic and two real-world datasets substantiate the effectiveness and superiority of our approach as compared with state-of-the-art.

关键词

multi-view; multi-feature; hierarchical model

全文

PDF


Predictive Coding Machine for Compressed Sensing and Image Denoising

摘要

Sparse and low rank coding has widely received much attention in machine learning, multimedia and computer vision. Unfortunately, expensive inference restricts the power of coding models in real-world applications, e.g., compressed sensing and image deblurring. In order to avoid the expensive inference, we propose a predictive coding machine (PCM) which aims to train a deep neural network (DNN) encoder to approximate the codes. By this means, a test sample can be fast approximated by the well-trained DNN. However, DNN leads PCM to be a non-convex and non-smooth optimization problem, which is extremely hard to solve. To address this challenge, we extend accelerated proximal gradient for PCM by steering gradient descent of DNN. To the best of our knowledge, we are the first to propose a gradient descent algorithm guided by accelerated proximal gradient for solving the PCM problem. Besides, a sufficient condition is provided to ensure the convergence to a critical point. Moreover, when the coding models are convex in PCM, the convergence rate O(1/(m2√t)) can be held in which m is the iteration number of accelerated proximal gradient, and t is the epoch of training DNN. Numerical results verify the promising advantages of PCM in terms of effectiveness, efficiency and robustness.

关键词

sparse coding; neural networks; convex optimization; compressed sensing; image denoising

全文

PDF


Unsupervised Personalized Feature Selection

摘要

Feature selection is effective in preparing high-dimensional data for a variety of learning tasks such as classification, clustering and anomaly detection. A vast majority of existing feature selection methods assume that all instances share some common patterns manifested in a subset of shared features. However, this assumption is not necessarily true in many domains where data instances could show high individuality. For example, in the medical domain, we need to capture the heterogeneous nature of patients for personalized predictive modeling, which could be characterized by a subset of instance-specific features. Motivated by this, we propose to study a novel problem of personalized feature selection. In particular, we investigate the problem in an unsupervised scenario as label information is usually hard to obtain in practice. To be specific, we present a novel unsupervised personalized feature selection framework UPFS to find some shared features by all instances and instance-specific features tailored to each instance. We formulate the problem into a principled optimization framework and provide an effective algorithm to solve it. Experimental results on real-world datasets verify the effectiveness of the proposed UPFS framework.

关键词

Feature Selection; Personalized Algorithms; Unsupervised Learning

全文

PDF


Latent Discriminant Subspace Representations for Multi-View Outlier Detection

摘要

Identifying multi-view outliers is challenging because of the complex data distributions across different views. Existing methods cope this problem by exploiting pairwise constraints across different views to obtain new feature representations,based on which certain outlier score measurements are defined. Due to the use of pairwise constraint, it is complicated and time-consuming for existing methods to detect outliers from three or more views. In this paper, we propose a novel method capable of detecting outliers from any number of dataviews. Our method first learns latent discriminant representations for all view data and defines a novel outlier score function based on the latent discriminant representations. Specifically, we represent multi-view data by a global low-rank representation shared by all views and residual representations specific to each view. Through analyzing the view-specific residual representations of all views, we can get the outlier score for every sample. Moreover, we raise the problem of detectinga third type of multi-view outliers which are neglected by existing methods. Experiments on six datasets show our method outperforms the existing ones in identifying all types of multi-view outliers, often by large margins.

关键词

全文

PDF


Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions

摘要

Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability---they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as "black box" models, and in the past, have been trained purely to optimize the accuracy of predictions. In this work, we create a novel network architecture for deep learning that naturally explains its own reasoning for each prediction. This architecture contains an autoencoder and a special prototype layer, where each unit of that layer stores a weight vector that resembles an encoded training input. The encoder of the autoencoder allows us to do comparisons within the latent space, while the decoder allows us to visualize the learned prototypes. The training objective has four terms: an accuracy term, a term that encourages every prototype to be similar to at least one encoded input, a term that encourages every encoded input to be close to at least one prototype, and a term that encourages faithful reconstruction by the autoencoder. The distances computed in the prototype layer are used as part of the classification process. Since the prototypes are learned during training, the learned network naturally comes with explanations for each prediction, and the explanations are loyal to what the network actually computes.

关键词

Interpretable Machine Learning, Case-Based Reasoning, Prototype Learning, Deep Learning

全文

PDF


Deeper Insights Into Graph Convolutional Networks for Semi-Supervised Learning

摘要

Many interesting problems in machine learning are being revisited with new deep learning tools. For graph-based semi-supervised learning, a recent important development is graph convolutional networks (GCNs), which nicely integrate local vertex features and graph topology in the convolutional layers. Although the GCN model compares favorably with other state-of-the-art methods, its mechanisms are not clear and it still requires considerable amount of labeled data for validation and model selection. In this paper, we develop deeper insights into the GCN model and address its fundamental limits. First, we show that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers. Second, to overcome the limits of the GCN model with shallow architectures, we propose both co-training and self-training approaches to train GCNs. Our approaches significantly improve GCNs in learning with very few labels, and exempt them from requiring additional labels for validation. Extensive experiments on benchmarks have verified our theory and proposals.

关键词

graph convolutional networks; semi-supervised learning; deep learning; graph-based learning

全文

PDF


Adaptive Graph Convolutional Neural Networks

摘要

Graph Convolutional Neural Networks (Graph CNNs) are generalizations of classical CNNs to handle graph data such as molecular data, point could and social networks. Current filters in graph CNNs are built for fixed and shared graph structure. However, for most real data, the graph structures varies in both size and connectivity. The paper proposes a generalized and flexible graph CNN taking data of arbitrary graph structure as input. In that way a task-driven adaptive graph is learned for each graph data while training. To efficiently learn the graph, a distance metric learning is proposed. Extensive experiments on nine graph-structured datasets have demonstrated the superior performance improvement on both convergence speed and predictive accuracy.

关键词

graph CNN;spectral filter;metric learning

全文

PDF


Online Clustering of Contextual Cascading Bandits

摘要

We consider a new setting of online clustering of contextual cascading bandits, an online learning problem where the underlying cluster structure over users is unknown and needs to be learned from a random prefix feedback. More precisely, a learning agent recommends an ordered list of items to a user, who checks the list and stops at the first satisfactory item, if any. We propose an algorithm of CLUB-cascade for this setting and prove an n-step regret bound of order O(√n). Previous work corresponds to the degenerate case of only one cluster, and our general regret bound in this special case also significantly improves theirs. We conduct experiments on both synthetic and real data, and demonstrate the effectiveness of our algorithm and the advantage of incorporating online clustering method.

关键词

全文

PDF


An Optimal Online Method of Selecting Source Policies for Reinforcement Learning

摘要

Transfer learning significantly accelerates the reinforcement learning process by exploiting relevant knowledge from previous experiences. The problem of optimally selecting source policies during the learning process is of great importance yet challenging. There has been little theoretical analysis of this problem. In this paper, we develop an optimal online method to select source policies for reinforcement learning. This method formulates online source policy selection as a multi-armed bandit problem and augments Q-learning with policy reuse. We provide theoretical guarantees of the optimal selection process and convergence to the optimal policy. In addition, we conduct experiments on a grid-based robot navigation domain to demonstrate its efficiency and robustness by comparing to the state-of-the-art transfer learning method.

关键词

reinforcement learning, transfer learning

全文

PDF


Statistical Inference Using SGD

摘要

We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. To show the merits of our scheme, we apply it to both synthetic and real data sets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation.

关键词

statistical inference; confidence interval; hypothesis testing; stochastic gradient descent;

全文

PDF


Domain Generalization via Conditional Invariant Representations

摘要

Domain generalization aims to apply knowledge gained from multiple labeled source domains to unseen target domains. The main difficulty comes from the dataset bias: training data and test data have different distributions, and the training set contains heterogeneous samples from different distributions. Let X denote the features, and Y be the class labels. Existing domain generalization methods address the dataset bias problem by learning a domain-invariant representation h(X) that has the same marginal distribution P(h(X)) across multiple source domains. The functional relationship encoded in P(Y|X) is usually assumed to be stable across domains such that P(Y|h(X)) is also invariant. However, it is unclear whether this assumption holds in practical problems. In this paper, we consider the general situation where both P(X) and P(Y|X) can change across all domains. We propose to learn a feature representation which has domain-invariant class conditional distributions P(h(X)|Y). With the conditional invariant representation, the invariance of the joint distribution P(h(X),Y) can be guaranteed if the class prior P(Y) does not change across training and test domains. Extensive experiments on both synthetic and real data demonstrate the effectiveness of the proposed method.

关键词

全文

PDF


Learning With Incomplete Labels

摘要

For many real-world tagging problems, training labels are usually obtained through social tagging and are notoriously incomplete. Consequently, handling data with incomplete labels has become a difficult challenge, which usually leads to a degenerated performance on label prediction. To improve the generalization performance, in this paper, we first propose the Improved Cross-View learning (referred as ICVL) model, which considers both global and local patterns of label relationship to enrich the original label set. Further, by extending the ICVL model with an outlier detection mechanism, we introduce the Improved Cross-View learning with Outlier Detection (referred as ICVL-OD) model to remove the abnormal tags resulting from label enrichment. Extensive evaluations on three benchmark datasets demonstrate that ICVL and ICVL-OD outstand with superior performances in comparison with the competing methods.

关键词

全文

PDF


Balanced Clustering via Exclusive Lasso: A Pragmatic Approach

摘要

Clustering is an effective technique in data mining to generate groups that are the matter of interest.Among various clustering approaches, the family of k-means algorithms and min-cut algorithms gain most popularity due to their simplicity and efficacy. The classical k-means algorithm partitions a number of data points into several subsets by iteratively updating the clustering centers and the associated data points. By contrast, a weighted undirected graph is constructed in min-cut algorithms which partition the vertices of the graph into two sets. However, existing clustering algorithms tend to cluster minority of data points into a subset, which shall be avoided when the target dataset is balanced. To achieve more accurate clustering for balanced dataset, we propose to leverage exclusive lasso on k-means and min-cut to regulate the balance degree of the clustering results. By optimizing our objective functions that build atop the exclusive lasso, we can make the clustering result as much balanced as possible. Extensive experiments on several large-scale datasets validate the advantage of the proposed algorithms compared to the state-of-the-art clustering algorithms.

关键词

Balanced Clustering; k-means; Min-Cut

全文

PDF


Robust Formulation for PCA: Avoiding Mean Calculation With L2,p-norm Maximization

摘要

Most existing robust principal component analysis (PCA) involve mean estimation for extracting low-dimensional representation. However, they do not get the optimal mean for real data, which include outliers, under the different robust distances metric learning, such as L1-norm and L2,1-norm. This affects the robustness of algorithms. Motivated by the fact that the variance of data can be characterized by the variation between each pair of data, we propose a novel robust formulation for PCA. It avoids computing the mean of data in the criterion function. Our method employs L2,p-norm as the distance metric to measure the variation in the criterion function and aims to seek the projection matrix that maximizes the sum of variation between each pair of the projected data. Both theoretical analysis and experimental results demonstrate that our methods are efficient and superior to most existing robust methods for data reconstruction.

关键词

全文

PDF


CoDiNMF: Co-Clustering of Directed Graphs via NMF

摘要

Co-clustering computes clusters of data items and the related features concurrently, and it has been used in many applications such as community detection, product recommendation, computer vision, and pricing optimization. In this paper, we propose a new co-clustering method, called CoDiNMF, which improves the clustering quality and finds directional patterns among co-clusters by using multiple directed and undirected graphs. We design the objective function of co-clustering by using min-cut criterion combined with an additional term which controls the sum of net directional flow between different co-clusters. In addition, we show that a variant of Nonnegative Matrix Factorization (NMF) can solve the proposed objective function effectively. We run experiments on the US patents and BlogCatalog data sets whose ground truth have been known, and show that CoDiNMF improves clustering results compared to other co-clustering methods in terms of average F1 score, Rand index, and adjusted Rand index (ARI). Finally, we compare CoDiNMF and other co-clustering methods on the Wikipedia data set of philosophers, and we can find meaningful directional flow of influence among co-clusters of philosophers.

关键词

Clustering; Data Mining and Knowledge Discovery

全文

PDF


Transferable Contextual Bandit for Cross-Domain Recommendation

摘要

Traditional recommendation systems (RecSys) suffer from two problems: the exploitation-exploration dilemma and the cold-start problem. One solution to solving the exploitation-exploration dilemma is the contextual bandit policy, which adaptively exploits and explores user interests. As a result, the contextual bandit policy achieves increased rewards in the long run. The contextual bandit policy, however, may cause the system to explore more than needed in the cold-start situations, which can lead to worse short-term rewards. Cross-domain RecSys methods adopt transfer learning to leverage prior knowledge in a source RecSys domain to jump start the cold-start target RecSys. To solve the two problems together, in this paper, we propose the first applicable transferable contextual bandit (TCB) policy for the cross-domain recommendation. TCB not only benefits the exploitation but also accelerates the exploration in the target RecSys. TCB's exploration, in turn, helps to learn how to transfer between different domains. TCB is a general algorithm for both homogeneous and heterogeneous domains. We perform both theoretical regret analysis and empirical experiments. The empirical results show that TCB outperforms the state-of-the-art algorithms over time.

关键词

Transfer Learning; Multi-Armed Bandit; Recommender System; Reinforcement Learning

全文

PDF


Riemannian Stein Variational Gradient Descent for Bayesian Inference

摘要

We develop Riemannian Stein Variational Gradient Descent (RSVGD), a Bayesian inference method that generalizes Stein Variational Gradient Descent (SVGD) to Riemann manifold. The benefits are two-folds: (i) for inference tasks in Euclidean spaces, RSVGD has the advantage over SVGD of utilizing information geometry, and (ii) for inference tasks on Riemann manifolds, RSVGD brings the unique advantages of SVGD to the Riemannian world. To appropriately transfer to Riemann manifolds, we conceive novel and non-trivial techniques for RSVGD, which are required by the intrinsically different characteristics of general Riemann manifolds from Euclidean spaces. We also discover Riemannian Stein's Identity and Riemannian Kernelized Stein Discrepancy. Experimental results show the advantages over SVGD of exploring distribution geometry and the advantages of particle-efficiency, iteration-effectiveness and approximation flexibility over other inference methods on Riemann manifolds.

关键词

Bayesian inference; Riemann manifold; Information geometry; kernel methods

全文

PDF


Dual Set Multi-Label Learning

摘要

In this paper, we propose a new learning framework named dual set multi-label learning, where there are two sets of labels, and an object has one and only one positive label in each set. Compared to general multi-label learning, the exclusive relationship among labels within the same set, and the pairwise inter-set label relationship are much more explicit and more likely to be fully exploited. To handle such kind of problems, a novel boosting style algorithm with model-reuse and distribution adjusting mechanisms is proposed to make the two label sets help each other. In addition, theoretical analyses are presented to show the superiority of learning from dual label sets to learning directly from all labels. To empirically evaluate the performance of our approach, we conduct experiments on two manually collected real-world datasets along with an adapted dataset. Experimental results validate the effectiveness of our approach for dual set multi-label learning.

关键词

Machine Learning; Classification; Multi-Label Learning

全文

PDF


Information Directed Sampling for Stochastic Bandits With Graph Feedback

摘要

We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. We allow the graph structure to vary with time and consider both deterministic and Erdos-Renyi random graph models. For such a graph feedback model, we first present a novel analysis of Thompson sampling that leads to tighter performance bound than existing work. Next, we propose new Information Directed Sampling based policies that are graph-aware in their decision making. Under the deterministic graph case, we establish a Bayesian regret bound for the proposed policies that scales with the clique cover number of the graph instead of the number of actions. Under the random graph case, we provide a Bayesian regret bound for the proposed policies that scales with the ratio of the number of actions over the expected number of observations per iteration. To the best of our knowledge, this is the first analytical result for stochastic bandits with random graph feedback. Finally, using numerical evaluations, we demonstrate that our proposed IDS policies outperform existing approaches, including adaptions of upper confidence bound, epsilon-greedy and Exp3 algorithms.

关键词

Information directed sampling; Stochastic multi-armed bandits; Graph feedback; Online learning

全文

PDF


A Change-Detection Based Framework for Piecewise-Stationary Multi-Armed Bandit Problem

摘要

The multi-armed bandit problem has been extensively studied under the stationary assumption. However in reality, this assumption often does not hold because the distributions of rewards themselves may change over time. In this paper, we propose a change-detection (CD) based framework for multi-armed bandit problems under the piecewise-stationary setting, and study a class of change-detection based UCB (Upper Confidence Bound) policies, CD-UCB, that actively detects change points and restarts the UCB indices. We then develop CUSUM-UCB and PHT-UCB, that belong to the CD-UCB class and use cumulative sum (CUSUM) and Page-Hinkley Test (PHT) to detect changes. We show that CUSUM-UCB obtains the best known regret upper bound under mild assumptions. We also demonstrate the regret reduction of the CD-UCB policies over arbitrary Bernoulli rewards and Yahoo! datasets of webpage click-through rates.

关键词

Multi-armed bandits; Online learning; Change detection

全文

PDF


Nonlinear Pairwise Layer and Its Training for Kernel Learning

摘要

Kernel learning is a fundamental technique that has been intensively studied in the past decades. For the complicated practical tasks, the traditional "shallow" kernels (e.g., Gaussian kernel and sigmoid kernel) are not flexible enough to produce satisfactory performance. To address this shortcoming, this paper introduces a nonlinear layer in kernel learning to enhance the model flexibility. This layer is pairwise, which fully considers the coupling information among examples. So our model contains a fixed single mapping layer (i.e. a Gaussian kernel) as well as a nonlinear pairwise layer, thereby achieving better flexibility than the existing kernel structures. Moreover, the proposed structure can be seamlessly embedded to Support Vector Machines (SVM), of which the training process can be formulated as a joint optimization problem including nonlinear function learning and standard SVM optimization. We theoretically prove that the objective function is gradient-Lipschitz continuous, which further guides us how to accelerate the optimization process in a deep kernel architecture. Experimentally, we find that the proposed structure outperforms other state-ofthe-art kernel-based algorithms on various benchmark datasets, and thus the effectiveness of the incorporated pairwise layer with its training approach is demonstrated.

关键词

全文

PDF


A Batch Learning Framework for Scalable Personalized Ranking

摘要

In designing personalized ranking algorithms, it is desirable to encourage a high precision at the top of the ranked list. Existing methods either seek a smooth convex surrogate for a non-smooth ranking metric or directly modify updating procedures to encourage top accuracy. In this work we point out that these methods do not scale well in a large-scale setting, and this is partly due to the inaccurate pointwise or pairwise rank estimation. We propose a new framework for personalized ranking. It uses batch-based rank estimators and smooth rank-sensitive loss functions. This new batch learning framework leads to more stable and accurate rank approximations compared to previous work. Moreover, it enables explicit use of parallel computation to speed up training. We conduct empirical evaluations on three item recommendation tasks, and our method shows a consistent accuracy improvement over current state-of-the-art methods. Additionally, we observe time efficiency advantages when data scale increases.

关键词

learning to rank; batch; recommendation; scalable

全文

PDF


Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-Offs by Selective Execution

摘要

We introduce Dynamic Deep Neural Networks (D2NN), a new type of feed-forward deep neural network that allows selective execution. Given an input, only a subset of D2NN neurons are executed, and the particular subset is determined by the D2NN itself. By pruning unnecessary computation depending on input, D2NNs provide a way to improve computational efficiency. To achieve dynamic selective execution, a D2NN augments a feed-forward deep neural network (directed acyclic graph of differentiable modules) with controller modules. Each controller module is a sub-network whose output is a decision that controls whether other modules can execute. A D2NN is trained end to end. Both regular and controller modules in a D2NN are learnable and are jointly trained to optimize both accuracy and efficiency. Such training is achieved by integrating backpropagation with reinforcement learning. With extensive experiments of various D2NN architectures on image classification tasks, we demonstrate that D2NNs are general and flexible, and can effectively optimize accuracy-efficiency trade-offs.

关键词

全文

PDF


Doubly Approximate Nearest Neighbor Classification

摘要

Nonparametric classification models, such as K-Nearest Neighbor (KNN), have become particularly powerful tools in machine learning and data mining, due to their simplicity and flexibility. However, the testing time of the KNN classifier becomes unacceptable and the KNN's performance deteriorates significantly when applied to data sets with millions of dimensions. We observe that state-of-the-art approximate nearest neighbor (ANN) methods aim to either reduce the number of distance comparisons based on tree structure or decrease the cost of distance computation by dimension reduction methods. In this paper, we propose a doubly approximate nearest neighbor classification strategy, which marries the two branches which compress the dimensions for decreasing distance computation cost as well as reduce the number of distance comparison instead of full scan. Under this strategy, we build a compressed dimensional tree (CD-Tree) to avoid unnecessary distance calculations. In each decision node, we propose a novel feature selection paradigm by optimizing the feature selection vector as well as the separator (indicator variables for splitting instances) with the maximum margin. An efficient algorithm is then developed to find the globally optimal solution with convergence guarantee. Furthermore, we also provide a data-dependent generalization error bound for our model, which reveals a new insight for the design of ANN classification algorithms. Our empirical studies show that our algorithm consistently obtains competitive or better classification results on all data sets, yet we can also achieve three orders of magnitude faster than state-of-the-art libraries on very high dimensions.

关键词

K Nearest Neighbors(KNN); Large scale high dimensions; Tree; Maximum margin

全文

PDF


Euler Sparse Representation for Image Classification

摘要

Sparse representation based classification (SRC) has gained great success in image recognition. Motivated by the fact that kernel trick can capture the nonlinear similarity of features, which may help improve the separability and margin between nearby data points, we propose Euler SRC for image classification, which is essentially the SRC with Euler sparse representation. To be specific, it first maps the images into the complex space by Euler representation, which has a negligible effect for outliers and illumination, and then performs complex SRC with Euler representation. The major advantage of our method is that Euler representation is explicit with no increase of the image space dimensionality, thereby enabling this technique to be easily deployed in real applications. To solve Euler SRC, we present an efficient algorithm, which is fast and has good convergence. Extensive experimental results illustrate that Euler SRC outperforms traditional SRC and achieves better performance for image classification.

关键词

全文

PDF


Variational Probability Flow for Biologically Plausible Training of Deep Neural Networks

摘要

The quest for biologically plausible deep learning is driven, not just by the desire to explain experimentally-observed properties of biological neural networks, but also by the hope of discovering more efficient methods for training artificial networks. In this paper, we propose a new algorithm named Variational Probably Flow (VPF), an extension of minimum probability flow for training binary Deep Boltzmann Machines (DBMs). We show that weight updates in VPF are local, depending only on the states and firing rates of the adjacent neurons. Unlike contrastive divergence, there is no need for Gibbs confabulations; and unlike backpropagation, alternating feedforward and feedback phases are not required. Moreover, the learning algorithm is effective for training DBMs with intra-layer connections between the hidden nodes. Experiments with MNIST and Fashion MNIST demonstrate that VPF learns reasonable features quickly, reconstructs corrupted images more accurately, and generates samples with a high estimated log-likelihood. Lastly, we note that, interestingly, if an asymmetric version of VPF exists, the weight updates directly explain experimental results in Spike-Timing-Dependent Plasticity (STDP).

关键词

Biologically plausible learning, deep learning, Spike-timing-dependent Plasticity

全文

PDF


A Parallelizable Acceleration Framework for Packing Linear Programs

摘要

This paper presents an acceleration framework for packing linear programming problems where the amount of data available is limited, i.e., where the number of constraints m is small compared to the variable dimension n. The framework can be used as a black box to speed up linear programming solvers dramatically, by two orders of magnitude in our experiments. We present worst-case guarantees on the quality of the solution and the speedup provided by the algorithm, showing that the framework provides an approximately optimal solution while running the original solver on a much smaller problem. The framework can be used to accelerate exact solvers, approximate solvers, and parallel/distributed solvers. Further, it can be used for both linear programs and integer linear programs.

关键词

optimization; linear programs; parallel algorithms

全文

PDF


Nonconvex Sparse Spectral Clustering by Alternating Direction Method of Multipliers and Its Convergence Analysis

摘要

Spectral Clustering (SC) is a widely used data clustering method which first learns a low-dimensional embedding U of data by computing the eigenvectors of the normalized Laplacian matrix, and then performs k-means on UT to get the final clustering result. The Sparse Spectral Clustering (SSC) method extends SC with a sparse regularization on UUT by using the block diagonal structure prior of UUT in the ideal case. However, encouraging UUT to be sparse leads to a heavily nonconvex problem which is challenging to solve and the work (Lu, Yan, and Lin 2016) proposes a convex relaxation in the pursuit of this aim indirectly. However, the convex relaxation generally leads to a loose approximation and the quality of the solution is not clear. This work instead considers to solve the nonconvex formulation of SSC which directly encourages UUT to be sparse. We propose an efficient Alternating Direction Method of Multipliers (ADMM) to solve the nonconvex SSC and provide the convergence guarantee. In particular, we prove that the sequences generated by ADMM always exist a limit point and any limit point is a stationary point. Our analysis does not impose any assumptions on the iterates and thus is practical. Our proposed ADMM for nonconvex problems allows the stepsize to be increasing but upper bounded, and this makes it very efficient in practice. Experimental analysis on several real data sets verifies the effectiveness of our method.

关键词

sparse spectral clustering; nonconvex ADMM; convergence analysis

全文

PDF


Matrix Variate Gaussian Mixture Distribution Steered Robust Metric Learning

摘要

Mahalanobis Metric Learning (MML) has been actively studied recently in machine learning community. Most of existing MML methods aim to learn a powerful Mahalanobis distance for computing similarity of two objects. More recently, multiple methods use matrix norm regularizers to constrain the learned distance matrixMto improve the performance. However, in real applications, the structure of the distance matrix M is complicated and cannot be characterized well by the simple matrix norm. In this paper, we propose a novel robust metric learning method with learning the structure of the distance matrix in a new and natural way. We partition M into blocks and consider each block as a random matrix variate, which is fitted by matrix variate Gaussian mixture distribution. Different from existing methods, our model has no any assumption on M and automatically learns the structure of M from the real data, where the distance matrix M often is neither sparse nor low-rank. We design an effective algorithm to optimize the proposed model and establish the corresponding theoretical guarantee. We conduct extensive evaluations on the real-world data. Experimental results show our method consistently outperforms the related state-of-the-art methods.

关键词

Robust Metric Learning; Gaussian Mixture Distribution

全文

PDF


Consistent and Specific Multi-View Subspace Clustering

摘要

Multi-view clustering has attracted intensive attention due to the effectiveness of exploiting multiple views of data. However, most existing multi-view clustering methods only aim to explore the consistency or enhance the diversity of different views. In this paper, we propose a novel multi-view subspace clustering method (CSMSC), where consistency and specificity are jointly exploited for subspace representation learning. We formulate the multi-view self-representation property using a shared consistent representation and a set of specific representations, which better fits the real-world datasets. Specifically, consistency models the common properties among all views, while specificity captures the inherent difference in each view. In addition, to optimize the non-convex problem, we introduce a convex relaxation and develop an alternating optimization algorithm to recover the corresponding data representations. Experimental evaluations on four benchmark datasets demonstrate that the proposed approach achieves better performance over several state-of-the-arts.

关键词

Multi-view learning; Subspace clustering

全文

PDF


Stochastic Non-Convex Ordinal Embedding With Stabilized Barzilai-Borwein Step Size

摘要

Learning representation from relative similarity comparisons, often called ordinal embedding, gains rising attention in recent years. Most of the existing methods are batch methods designed mainly based on the convex optimization, say, the projected gradient descent method. However, they are generally time-consuming due to that the singular value decomposition (SVD) is commonly adopted during the update, especially when the data size is very large. To overcome this challenge, we propose a stochastic algorithm called SVRG-SBB, which has the following features: (a) SVD-free via dropping convexity, with good scalability by the use of stochastic algorithm, i.e., stochastic variance reduced gradient (SVRG), and (b) adaptive step size choice via introducing a new stabilized Barzilai-Borwein (SBB) method as the original version for convex problems might fail for the considered stochastic non-convex optimization problem. Moreover, we show that the proposed algorithm converges to a stationary point at a rate O(1/T) in our setting, where T is the number of total iterations. Numerous simulations and real-world data experiments are conducted to show the effectiveness of the proposed algorithm via comparing with the state-of-the-art methods, particularly, much lower computational cost with good prediction performance.

关键词

Stochastic Optimization; Non-convex Optimization; Barzilai-Borwein Step Size

全文

PDF


MDP-Based Cost Sensitive Classification Using Decision Trees

摘要

In classification, an algorithm learns to classify a given instance based on a set of observed attribute values. In many real world cases testing the value of an attribute incurs a cost. Furthermore, there can also be a cost associated with the misclassification of an instance. Cost sensitive classification attempts to minimize the expected cost of classification, by deciding after each observed attribute value, which attribute to measure next. In this paper we suggest Markov Decision Processes as a modeling tool for cost sensitive classification. We construct standard decision trees over all attribute subsets, and the leaves of these trees become the state space of our MDP. At each phase we decide on the next attribute to measure, balancing the cost of the measurement and the classification accuracy. We compare our approach to a set of previous approaches, showing our approach to work better for a range of misclassification costs.

关键词

Classification, Cost-sensitive, Misclassification costs, Test costs, MDP

全文

PDF


Data-Dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion

摘要

Embedding-based methods for knowledge base completion (KBC) learn representations of entities and relations in a vector space, along with the scoring function to estimate the likelihood of relations between entities. The learnable class of scoring functions is designed to be expressive enough to cover a variety of real-world relations, but this expressive comes at the cost of an increased number of parameters. In particular, parameters in these methods are superfluous for relations that are either symmetric or antisymmetric. To mitigate this problem, we propose a new L1 regularizer for Complex Embeddings, which is one of the state-of-the-art embedding-based methods for KBC. This regularizer promotes symmetry or antisymmetry of the scoring function on a relation-by-relation basis, in accordance with the observed data. Our empirical evaluation shows that the proposed method outperforms the original Complex Embeddings and other baseline methods on the FB15k dataset.

关键词

knowledge base completion

全文

PDF


Belief Reward Shaping in Reinforcement Learning

摘要

A key challenge in many reinforcement learning problems is delayed rewards, which can significantly slow down learning. Although reward shaping has previously been introduced to accelerate learning by bootstrapping an agent with additional information, this can lead to problems with convergence. We present a novel Bayesian reward shaping framework that augments the reward distribution with prior beliefs that decay with experience. Formally, we prove that under suitable conditions a Markov decision process augmented with our framework is consistent with the optimal policy of the original MDP when using the Q-learning algorithm. However, in general our method integrates seamlessly with any reinforcement learning algorithm that learns a value or action-value function through experience. Experiments are run on a gridworld and a more complex backgammon domain that show that we can learn tasks significantly faster when we specify intuitive priors on the reward distribution.

关键词

reward shaping; Bayesian statistics

全文

PDF


Learning Multi-Way Relations via Tensor Decomposition With Neural Networks

摘要

How can we classify multi-way data such as network traffic logs with multi-way relations between source IPs, destination IPs, and ports? Multi-way data can be represented as a tensor, and there have been several studies on classification of tensors to date. One critical issue in the classification of multi-way relations is how to extract important features for classification when objects in different multi-way data, i.e., in different tensors, are not necessarily in correspondence. In such situations, we aim to extract features that do not depend on how we allocate indices to an object such as a specific source IP; we are interested in only the structures of the multi-way relations. However, this issue has not been considered in previous studies on classification of multi-way data. We propose a novel method which can learn and classify multi-way data using neural networks. Our method leverages a novel type of tensor decomposition that utilizes a target core tensor expressing the important features whose indices are independent of those of the multi-way data. The target core tensor guides the tensor decomposition into more effective results and is optimized in a supervised manner. Our experiments on three different domains show that our method is highly accurate, especially on higher order data. It also enables us to interpret the classification results along with the matrices calculated with the novel tensor decomposition.

关键词

Tensor Decomposition; Neural Network; Interpretability

全文

PDF


Subgraph Pattern Neural Networks for High-Order Graph Evolution Prediction

摘要

In this work we generalize traditional node/link prediction tasks in dynamic heterogeneous networks, to consider joint prediction over larger k-node induced subgraphs. Our key insight is to incorporate the unavoidable dependencies in the training observations of induced subgraphs into both the input features and the model architecture itself via high-order dependencies. The strength of the representation is its invariance to isomorphisms and varying local neighborhood sizes, while still being able to take node/edge labels into account, and facilitating inductive reasoning (i.e., generalization to unseen portions of the network). Empirical results show that our proposed method significantly outperforms other state-of-the-art methods designed for static and/or single node/link prediction tasks. In addition, we show that our method is scalable and learns interpretable parameters.

关键词

ML: Relational/Graph-Based Learning; ML: Structured Prediction; ML: Deep Learning/Neural Networks; APP: Social Networks

全文

PDF


Exploiting Emotion on Reviews for Recommender Systems

摘要

Review history is widely used by recommender systems to infer users' preferences and help find the potential interests from the huge volumes of data, whereas it also brings in great concerns on the sparsity and cold-start problems due to its inadequacy. Psychology and sociology research has shown that emotion information is a strong indicator for users' preferences. Meanwhile, with the fast development of online services, users are willing to express their emotion on others' reviews, which makes the emotion information pervasively available. Besides, recent research shows that the number of emotion on reviews is always much larger than the number of reviews. Therefore incorporating emotion on reviews may help to alleviate the data sparsity and cold-start problems for recommender systems. In this paper, we provide a principled and mathematical way to exploit both positive and negative emotion on reviews, and propose a novel framework MIRROR, exploiting eMotIon on Reviews for RecOmmendeR systems from both global and local perspectives. Empirical results on real-world datasets demonstrate the effectiveness of our proposed framework and further experiments are conducted to understand how emotion on reviews works for the proposed framework.

关键词

Recommendation; Emotion; Cold-start

全文

PDF


Personalized Privacy-Preserving Social Recommendation

摘要

Privacy leakage is an important issue for social recommendation. Existing privacy preserving social recommendation approaches usually allow the recommender to fully control users' information. This may be problematic since the recommender itself may be untrusted, leading to serious privacy leakage. Besides, building social relationships requires sharing interests as well as other private information, which may lead to more privacy leakage. Although sometimes users are allowed to hide their sensitive private data using privacy settings, the data being shared can still be abused by the adversaries to infer sensitive private information. Supporting social recommendation with least privacy leakage to untrusted recommender and other users (i.e., friends) is an important yet challenging problem. In this paper, we aim to address the problem of achieving privacy-preserving social recommendation under personalized privacy settings. We propose PrivSR, a novel framework for privacy-preserving social recommendation, in which users can model ratings and social relationships privately. Meanwhile, by allocating different noise magnitudes to personalized sensitive and non-sensitive ratings, we can protect users' privacy against the untrusted recommender and friends. Theoretical analysis and experimental evaluation on real-world datasets demonstrate that our framework can protect users' privacy while being able to retain effectiveness of the underlying recommender system.

关键词

Recommendation; Social relationships; Privacy protection

全文

PDF


Proper Loss Functions for Nonlinear Hawkes Processes

摘要

Temporal point processes are a statistical framework for modelling the times at which events of interest occur. The Hawkes process is a well-studied instance of this framework that captures self-exciting behaviour, wherein the occurrence of one event increases the likelihood of future events. Such processes have been successfully applied to model phenomena ranging from earthquakes to behaviour in a social network. We propose a framework to design new loss functions to train linear and nonlinear Hawkes processes. This captures standard maximum likelihood as a special case, but allows for other losses that guarantee convex objective functions (for certain types of kernel), and admit simpler optimisation. We illustrate these points with three concrete examples: for linear Hawkes processes, we provide a least-squares style loss potentially admitting closed-form optimisation; for exponential Hawkes processes, we reduce training to a weighted logistic regression; and for sigmoidal Hawkes processes, we propose an asymmetric form of logistic regression.

关键词

全文

PDF


Bernoulli Embeddings for Graphs

摘要

Just as semantic hashing can accelerate information retrieval, binary valued embeddings can significantly reduce latency in the retrieval of graphical data. We introduce a simple but effective model for learning such binary vectors for nodes in a graph. By imagining the embeddings as independent coin flips of varying bias, continuous optimization techniques can be applied to the approximate expected loss. Embeddings optimized in this fashion consistently outperform the quantization of both spectral graph embeddings and various learned real-valued embeddings, on both ranking and pre-ranking tasks for a variety of datasets.

关键词

全文

PDF


Core Dependency Networks

摘要

Many applications infer the structure of a probabilistic graphical model from data to elucidate the relationships between variables. But how can we train graphical models on a massive data set? In this paper, we show how to construct coresets---compressed data sets which can be used as proxy for the original data and have provably bounded worst case error---for Gaussian dependency networks (DNs), i.e., cyclic directed graphical models over Gaussians, where the parents of each variable are its Markov blanket. Specifically, we prove that Gaussian DNs admit coresets of size independent of the size of the data set. Unfortunately, this does not extend to DNs over members of the exponential family in general. As we will prove, Poisson DNs do not admit small coresets. Despite this worst-case result, we will provide an argument why our coreset construction for DNs can still work well in practice on count data.To corroborate our theoretical results, we empirically evaluated the resulting Core DNs on real data sets. The results demonstrate significant gains over no or naive sub-sampling, even in the case of count data.

关键词

coresets, dependency networks, graphical models, structure discovery, Gaussian, Poisson, big data

全文

PDF


Mixed Sum-Product Networks: A Deep Architecture for Hybrid Domains

摘要

While all kinds of mixed data---from personal data, over panel and scientific data, to public and commercial data---are collected and stored, building probabilistic graphical models for these hybrid domains becomes more difficult. Users spend significant amounts of time in identifying the parametric form of the random variables (Gaussian, Poisson, Logit, etc.) involved and learning the mixed models. To make this difficult task easier, we propose the first trainable probabilistic deep architecture for hybrid domains that features tractable queries. It is based on Sum-Product Networks (SPNs) with piecewise polynomial leaf distributions together with novel nonparametric decomposition and conditioning steps using the Hirschfeld-Gebelein-Renyi Maximum Correlation Coefficient. This relieves the user from deciding a-priori the parametric form of the random variables but is still expressive enough to effectively approximate any distribution and permits efficient learning and inference.Our experiments show that the architecture, called Mixed SPNs, can indeed capture complex distributions across a wide range of hybrid domains.

关键词

sum-product networks, non-parametric density estimation, non-parameteric independency test, hybrid domains, mixed graphical models

全文

PDF


Alternating Circulant Random Features for Semigroup Kernels

摘要

The random features method is an efficient method to approximate the kernel function. In this paper, we propose novel random features called "alternating circulant random features,'' which consist of a random mixture of independent random structured matrices. Existing fast random features exploit random sign flipping to reduce the correlation between features. Sign flipping works well on random Fourier features for real-valued shift-invariant kernels because the corresponding weight distribution is symmetric. However, this method cannot be applied to random Laplace features directly because the distribution is not symmetric. The method proposed herein yields alternating circulant random features, with the correlation between features being reduced through the random sampling of weights from multiple independent random structured matrices instead of via random sign flipping. The proposed method facilitates rapid calculation by employing structured matrices. In addition, the weight distribution is preserved because sign flipping is not implemented. The performance of the proposed alternating circulant random features method is theoretically and empirically evaluated.

关键词

全文

PDF


Overlap-Robust Decision Boundary Learning for Within-Network Classification

摘要

We study the problem of within network classification, where given a partially labeled network, we infer the labels of the remaining nodes based on the link structure. Conventional loss functions penalize a node based on a function of its predicted label and target label. Such loss functions under-perform while learning on a network having overlapping classes. In relational setting, even though the ground truth is not known for the unlabeled nodes, some evidence is present in the form of labeling acquired by the nodes in their neighborhood. We propose a structural loss function for learning in networks based on the hypothesis that loss is induced when a node fails to acquire a label that is consistent with the labels of the majority of the nodes in its neighborhood. We further combine this with a novel semantic regularizer, which we call homophily regularizer, to capture the smooth transition of discriminatory power and behavior of semantically similar nodes. The proposed structural loss along with the regularizer permits relaxation labeling. Through extensive comparative study on different real-world datasets, we found that our method improves over the state-of-the-art approaches.

关键词

Semi-Supervised Learning in Graphs; Collective Classification; Structural Loss; Homophily Regularization

全文

PDF


A Provable Approach for Double-Sparse Coding

摘要

Sparse coding is a crucial subroutine in algorithms for various signal processing, deep learning, and other machine learning applications. The central goal is to learn an overcomplete dictionary that can sparsely represent a given dataset. However, storage, transmission, and processing of the learned dictionary can be untenably high if the data dimension is high. In this paper, we consider the double-sparsity model introduced by Rubinstein, Zibulevsky, and Elad (2010) where the dictionary itself is the product of a fixed, known basis and a data-adaptive sparse component. First, we introduce a simple algorithm for double-sparse coding that can be amenable to efficient implementation via neural architectures. Second, we theoretically analyze its performance and demonstrate asymptotic sample complexity and running time benefits over existing (provable) approaches for sparse coding. To our knowledge, our work introduces the first computationally efficient algorithm for double-sparse coding that enjoys rigorous statistical guarantees. Finally, we support our analysis via several numerical experiments on simulated data, confirming that our method can indeed be useful in problem sizes encountered in practical applications.

关键词

Sparse Coding; Dictionary Learning; Double-Sparse Coding; Feature Construction,

全文

PDF


Hierarchical Policy Search via Return-Weighted Density Estimation

摘要

Learning an optimal policy from a multi-modal reward function is a challenging problem in reinforcement learning (RL). Hierarchical RL (HRL) tackles this problem by learning a hierarchicalpolicy, where multiple option policies are in charge of different strategies corresponding to modes of a reward function and a gating policy selects the best option for a given context. Although HRL has been demonstrated to be promising, current state-of-the-art methods cannot still perform well in complex real-world problems due to the difficulty of identifying modes of the reward function. In this paper, we propose a novel method called hierarchical policy search via return-weighted density estimation (HPSDE), which can efficiently identify the modes through density estimation with return-weighted importance sampling. Our proposed method finds option policies corresponding to the modes of the return function and automatically determines the number and the location of option policies, which significantly reduces the burden of hyper-parameters tuning. Through experiments, we demonstrate that the proposed HPSDE successfully learns option policies corresponding to modes of the return function and that it can be successfully applied to a motion planning problem of a redundant robotic manipulator.

关键词

Reinforcement learning;

全文

PDF


Dynamic Determinantal Point Processes

摘要

The determinantal point process (DPP) has been receiving increasing attention in machine learning as a generative model of subsets consisting of relevant and diverse items. Recently, there has been a significant progress in developing efficient algorithms for learning the kernel matrix that characterizes a DPP. Here, we propose a dynamic DPP, which is a DPP whose kernel can change over time, and develop efficient learning algorithms for the dynamic DPP. In the dynamic DPP, the kernel depends on the subsets selected in the past, but we assume a particular structure in the dependency to allow efficient learning. We also assume that the kernel has a low rank and exploit a recently proposed learning algorithm for the DPP with low-rank factorization, but also show that its bottleneck computation can be reduced from O(M2 K) time to O(M K2) time, where M is the number of items under consideration, and K is the rank of the kernel, which can be set smaller than M by orders of magnitude.

关键词

Determinantal point process; Time series; Learning

全文

PDF


Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception

摘要

This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to choose a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanismfor a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on real-world datasets show that our algorithms outperform the state-of-the-art methods.

关键词

全文

PDF


Training CNNs With Normalized Kernels

摘要

Several methods of normalizing convolution kernels have been proposed in the literature to train convolutional neural networks (CNNs), and have shown some success. However, our understanding of these methods has lagged behind their success in application; there are a lot of open questions, such as why a certain type of kernel normalization is effective and what type of normalization should be employed for each (e.g., higher or lower) layer of a CNN. As the first step towards answering these questions, we propose a framework that enables us to use a variety of kernel normalization methods at any layer of a CNN. A naive integration of kernel normalization with a general optimization method, such as SGD, often entails instability while updating parameters. Thus, existing methods employ ad-hoc procedures to empirically assure convergence. In this study, we pose estimation of convolution kernels under normalization constraints as constraint-free optimization on kernel submanifolds that are identified by the employed constraints. Note that naive application of the established optimization methods for matrix manifolds to the aforementioned problems is not feasible because of the hierarchical nature of CNNs. To this end, we propose an algorithm for optimization on kernel manifolds in CNNs by appropriate scaling of the space of kernels based on structure of CNNs and statistics of data. We theoretically prove that the proposed algorithm has assurance of almost sure convergence to a solution at single minimum. Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves classification performance of baseline CNNs, and provides state-of-the-art performance for major image classification benchmarks.

关键词

Deep Learning/Neural Networks; Classification; Machine Learning (General/other); Learning Theory; Statistical Methods and Learning

全文

PDF


Sparse Modeling-Based Sequential Ensemble Learning for Effective Outlier Detection in High-Dimensional Numeric Data

摘要

The large proportion of irrelevant or noisy features in real-life high-dimensional data presents a significant challenge to subspace/feature selection-based high-dimensional outlier detection (a.k.a. outlier scoring) methods. These methods often perform the two dependent tasks: relevant feature subset search and outlier scoring independently, consequently retaining features/subspaces irrelevant to the scoring method and downgrading the detection performance. This paper introduces a novel sequential ensemble-based framework SEMSE and its instance CINFO to address this issue. SEMSE learns the sequential ensembles to mutually refine feature selection and outlier scoring by iterative sparse modeling with outlier scores as the pseudo target feature. CINFO instantiates SEMSE by using three successive recurrent components to build such sequential ensembles. Given outlier scores output by an existing outlier scoring method on a feature subset, CINFO first defines a Cantelli's inequality-based outlier thresholding function to select outlier candidates with a false positive upper bound. It then performs lasso-based sparse regression by treating the outlier scores as the target feature and the original features as predictors on the outlier candidate set to obtain a feature subset that is tailored for the outlier scoring method. Our experiments show that two different outlier scoring methods enabled by CINFO (i) perform significantly better on 11 real-life high-dimensional data sets, and (ii) have much better resilience to noisy features, compared to their bare versions and three state-of-the-art competitors. The source code of CINFO is available at https://sites.google.com/site/gspangsite/sourcecode.

关键词

Outlier Detection; Outlier Ensemble; Feature Selection; Sparse Modeling; Sequential Ensemble

全文

PDF


SAGA: A Submodular Greedy Algorithm for Group Recommendation

摘要

In this paper, we propose a unified framework and an algorithm for the problem of group recommendation where a fixed number of items or alternatives can be recommended to a group of users. The problem of group recommendation arises naturally in many real world contexts, and is closely related to the budgeted social choice problem studied in economics. We frame the group recommendation problem as choosing a subgraph with the largest group consensus score in a completely connected graph defined over the item affinity matrix. We propose a fast greedy algorithm with strong theoretical guarantees, and show that the proposed algorithm compares favorably to the state-of-the-art group recommendation algorithms according to commonly used relevance and coverage performance measures on benchmark dataset.

关键词

group recommendation; submodular function optimization; budgeted social choice

全文

PDF


Quantized Memory-Augmented Neural Networks

摘要

Memory-augmented neural networks (MANNs) refer to a class of neural network models equipped with external memory (such as neural Turing machines and memory networks). These neural networks outperform conventional recurrent neural networks (RNNs) in terms of learning long-term dependency, allowing them to solve intriguing AI tasks that would otherwise be hard to address. This paper concerns the problem of quantizing MANNs. Quantization is known to be effective when we deploy deep models on embedded systems with limited resources. Furthermore, quantization can substantially reduce the energy consumption of the inference procedure. These benefits justify recent developments of quantized multi layer perceptrons, convolutional networks, and RNNs. However, no prior work has reported the successful quantization of MANNs. The in-depth analysis presented here reveals various challenges that do not appear in the quantization of the other networks. Without addressing them properly, quantized MANNs would normally suffer from excessive quantization error which leads to degraded performance. In this paper, we identify memory addressing (specifically, content-based addressing) as the main reason for the performance degradation and propose a robust quantization method for MANNs to address the challenge. In our experiments, we achieved a computation-energy gain of 22× with 8-bit fixed-point and binary quantization compared to the floating-point implementation. Measured on the bAbI dataset, the resulting model, named the quantized MANN (Q-MANN), improved the error rate by 46% and 30% with 8-bit fixed-point and binary quantization, respectively, compared to the MANN quantized using conventional techniques.

关键词

Artificial Intelligence; Deep Learning; MANN; content-based addressing

全文

PDF


Adversarial Dropout for Supervised and Semi-Supervised Learning

摘要

Recently, training with adversarial examples, which are generated by adding a small but worst-case perturbation on input examples, has improved the generalization performance of neural networks. In contrast to the biased individual inputs to enhance the generality, this paper introduces adversarial dropout, which is a minimal set of dropouts that maximize the divergence between 1) the training supervision and 2) the outputs from the network with the dropouts. The identified adversarial dropouts are used to automatically reconfigure the neural network in the training process, and we demonstrated that the simultaneous training on the original and the reconfigured network improves the generalization performance of supervised and semi-supervised learning tasks on MNIST, SVHN, and CIFAR-10. We analyzed the trained model to find the performance improvement reasons. We found that adversarial dropout increases the sparsity of neural networks more than the standard dropout. Finally, we also proved that adversarial dropout is a regularization term with a rank-valued hyper-parameter that is different from a continuous-valued parameter to specify the strength of the regularization.

关键词

adversarial training; regularization; deep learning

全文

PDF


Alternating Optimisation and Quadrature for Robust Control

摘要

Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are unobservable and randomly determined by the environment in a physical setting but are controllable in a simulator. This paper considers the problem of finding a robust policy while taking into account the impact of environment variables. We present Alternating Optimisation and Quadrature (ALOQ), which uses Bayesian optimisation and Bayesian quadrature to address such settings. ALOQ is robust to the presence of significant rare events, which may not be observable under random sampling, but play a substantial role in determining the optimal policy. Experimental results across different domains show that ALOQ can learn more efficiently and robustly than existing methods.

关键词

Reinforcement Learning; Bayesian Optimisation; Bayesian Quadrature

全文

PDF


Multi-Adversarial Domain Adaptation

摘要

Recent advances in deep domain adaptation reveal that adversarial learning can be embedded into deep networks to learn transferable features that reduce distribution discrepancy between the source and target domains. Existing domain adversarial adaptation methods based on single domain discriminator only align the source and target data distributions without exploiting the complex multimode structures. In this paper, we present a multi-adversarial domain adaptation (MADA) approach, which captures multimode structures to enable fine-grained alignment of different data distributions based on multiple domain discriminators. The adaptation can be achieved by stochastic gradient descent with the gradients computed by back-propagation in linear-time. Empirical evidence demonstrates that the proposed model outperforms state of the art methods on standard domain adaptation datasets.

关键词

Transfer learning; Deep learning

全文

PDF


FiLM: Visual Reasoning with a General Conditioning Layer

摘要

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.

关键词

Deep Learning; Language and Vision

全文

PDF


Source Traces for Temporal Difference Learning

摘要

This paper motivates and develops source traces for temporal difference (TD) learning in the tabular setting. Source traces are like eligibility traces, but model potential histories rather than immediate ones. This allows TD errors to be propagated to potential causal states and leads to faster generalization. Source traces can be thought of as the model-based, backward view of successor representations (SR), and share many of the same benefits. This view, however, suggests several new ideas. First, a TD(λ)-like source learning algorithm is proposed and its convergence is proven. Then, a novel algorithm for learning the source map (or SR matrix) is developed and shown to outperform the previous algorithm. Finally, various approaches to using the source/SR model are explored, and it is shown that source traces can be effectively combined with other model-based methods like Dyna and experience replay.

关键词

全文

PDF


Randomized Clustered Nystrom for Large-Scale Kernel Machines

摘要

The Nystrom method is a popular technique for generating low-rank approximations of kernel matrices that arise in many machine learning problems. The approximation quality of the Nystrom method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we introduce a randomized algorithm for generating landmark points that is scalable to large high-dimensional data sets. The proposed method performs K-means clustering on low-dimensional random projections of a data set and thus leads to significant savings for high-dimensional data sets. Our theoretical results characterize the tradeoffs between accuracy and efficiency of the proposed method. Moreover, numerical experiments on classification and regression tasks demonstrate the superior performance and efficiency of our proposed method compared with existing approaches.

关键词

kernel methods; low-rank approximation; Nystrom; feature extraction

全文

PDF


Joint Learning of Set Cardinality and State Distribution

摘要

We present a novel approach for learning to predict sets using deep learning. In recent years, deep neural networks have shown remarkable results in computer vision, natural language processing and other related problems. Despite their success,traditional architectures suffer from a serious limitation in that they are built to deal with structured input and output data,i.e. vectors or matrices. Many real-world problems, however, are naturally described as sets, rather than vectors. Existing techniques that allow for sequential data, such as recurrent neural networks, typically heavily depend on the input and output order and do not guarantee a valid solution. Here, we derive in a principled way, a mathematical formulation for set prediction where the output is permutation invariant. In particular, our approach jointly learns both the cardinality and the state distribution of the target set. We demonstrate the validity of our method on the task of multi-label image classification and achieve a new state of the art on the PASCAL VOC and MS COCO datasets.

关键词

全文

PDF


Interpretable Graph-Based Semi-Supervised Learning via Flows

摘要

In this paper, we consider the interpretability of the foundational Laplacian-based semi-supervised learning approaches on graphs. We introduce a novel flow-based learning framework that subsumes the foundational approaches and additionally provides a detailed, transparent, and easily understood expression of the learning process in terms of graph flows. As a result, one can visualize and interactively explore the precise subgraph along which the information from labeled nodes flows to an unlabeled node of interest. Surprisingly, the proposed framework avoids trading accuracy for interpretability, but in fact leads to improved prediction accuracy, which is supported both by theoretical considerations and empirical results. The flow-based framework guarantees the maximum principle by construction and can handle directed graphs in an out-of-the-box manner.

关键词

Semi-Supervised Learning; Trustable and Explainable AI; Graphs; Flows

全文

PDF


Hypergraph p-Laplacian: A Differential Geometry View

摘要

The graph Laplacian plays key roles in information processing of relational data, and has analogies with the Laplacian in differential geometry. In this paper, we generalize the analogy between graph Laplacian and differential geometry to the hypergraph setting, and propose a novel hypergraph p-Laplacian. Unlike the existing two-node graph Laplacians, this generalization makes it possible to analyze hypergraphs, where the edges are allowed to connect any number of nodes. Moreover, we propose a semi-supervised learning method based on the proposed hypergraph p-Laplacian, and formalize them as the analogue to the Dirichlet problem, which often appears in physics. We further explore theoretical connections to normalized hypergraph cut on a hypergraph, and propose normalized cut corresponding to hypergraph p-Laplacian. The proposed p-Laplacian is shown to outperform standard hypergraph Laplacians in the experiment on a hypergraph semi-supervised learning and normalized cut setting.

关键词

全文

PDF


Word Co-Occurrence Regularized Non-Negative Matrix Tri-Factorization for Text Data Co-Clustering

摘要

Text data co-clustering is the process of partitioning the documents and words simultaneously. This approach has proven to be more useful than traditional one-sided clustering when dealing with sparsity. Among the wide range of co-clustering approaches, Non-Negative Matrix Tri-Factorization (NMTF) is recognized for its high performance, flexibility and theoretical foundations. One important aspect when dealing with text data, is to capture the semantic relationships between words since documents that are about the same topic may not necessarily use exactly the same vocabulary. However, this aspect has been overlooked by previous co-clustering models, including NMTF. To address this issue, we rely on the distributional hypothesis stating that words which co-occur frequently within the same context, e.g., a document or sentence, are likely to have similar meanings. We then propose a new NMTF model that maps frequently co-occurring words roughly to the same direction in the latent space to reflect the relationships between them. To infer the factor matrices, we derive a scalable alternating optimization algorithm, whose convergence is guaranteed. Extensive experiments, on several real-world datasets, provide strong evidence for the effectiveness of the proposed approach, in terms of co-clustering.

关键词

Co-Clustering; Non-Negative Matrix Tri-Factorization; Word Co-occurrence; Text Data

全文

PDF


Learning Vector Autoregressive Models With Latent Processes

摘要

We study the problem of learning the support of transition matrix between random processes in a Vector Autoregressive (VAR) model from samples when a subset of the processes are latent. It is well known that ignoring the effect of the latent processes may lead to very different estimates of the influences among observed processes, and we are concerned with identifying the influences among the observed processes, those between the latent ones, and those from the latent to the observed ones. We show that the support of transition matrix among the observed processes and lengths of all latent paths between any two observed processes can be identified successfully under some conditions on the VAR model. From the lengths of latent paths, we reconstruct the latent subgraph (representing the influences among the latent processes) with a minimum number of variables uniquely if its topology is a directed tree. Furthermore, we propose an algorithm that finds all possible minimal latent graphs under some conditions on the lengths of latent paths. Our results apply to both non-Gaussian and Gaussian cases, and experimental results on various synthetic and real-world datasets validate our theoretical results.

关键词

Graphical model learning; Causal structures; VAR models

全文

PDF


Regularizing Deep Networks Using Efficient Layerwise Adversarial Training

摘要

Adversarial training has been shown to regularize deep neural networks in addition to increasing their robustness to adversarial examples. However, the regularization effect on very deep state of the art networks has not been fully investigated. In this paper, we present a novel approach to regularize deep neural networks by perturbing intermediate layer activations in an efficient manner. We use these perturbations to train very deep models such as ResNets and WideResNets and show improvement in performance across datasets of different sizes such as CIFAR-10, CIFAR-100 and ImageNet. Our ablative experiments show that the proposed approach not only provides stronger regularization compared to Dropout but also improves adversarial robustness comparable to traditional adversarial training approaches.

关键词

Deep Learning; Adversarial Training; Regularization; Classification

全文

PDF


From Monte Carlo to Las Vegas: Improving Restricted Boltzmann Machine Training Through Stopping Sets

摘要

We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs). We denote our approach Markov Chain Las Vegas (MCLV). MCLV gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has maximum number of Markov chain steps K (referred as MCLV-K). We present a MCLV-K gradient estimator (LVS-K) for RBMs and explore the correspondence and differences between LVS-K and Contrastive Divergence (CD-K), with LVS-K significantly outperforming CD-K training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.

关键词

ML: Deep Learning/Neural Networks, ML: Unsupervised Learning

全文

PDF


On Data-Dependent Random Features for Improved Generalization in Supervised Learning

摘要

The randomized-feature approach has been successfully employed in large-scale kernel approximation and supervised learning. The distribution from which the random features are drawn impacts the number of features required to efficiently perform a learning task. Recently, it has been shown that employing data-dependent randomization improves the performance in terms of the required number of random features. In this paper, we are concerned with the randomized-feature approach in supervised learning for good generalizability. We propose the Energy-based Exploration of Random Features (EERF) algorithm based on a data-dependent score function that explores the set of possible features and exploits the promising regions. We prove that the proposed score function with high probability recovers the spectrum of the best fit within the model class. Our empirical results on several benchmark datasets further verify that our method requires smaller number of random features to achieve a certain generalization error compared to the state-of-the-art while introducing negligible pre-processing overhead. EERF can be implemented in a few lines of code and requires no additional tuning parameters.

关键词

Kernel methods; random features

全文

PDF


Labeled Memory Networks for Online Model Adaptation

摘要

Augmenting a neural network with memory that can grow without growing the number of trained parameters is a recent powerful concept with many exciting applications. In this paper, we establish their potential in online adapting a batch trained neural network to domain-relevant labeled data at deployment time. We present the design of Labeled Memory Network (LMN), a new memory augmented neural network (MANN) for fast online model adaptation. We highlight three key features of LMNs. First, LMNs treat memory as a second boosted stage following the trained network thereby allowing the memory and network to play complementary roles. Unlike all existing MANNs that write to memory at every cycle, LMNs provide better memory utilization by writing only labeled data with non-zero loss. Second, LMNs organize the memory with the discrete class label as the primary key unlike existing MANNs where key is a real vector derived from the input. This simple, yet surprisingly unexplored alternative organization, safeguards against catastrophic forgetting of rare labels that current LRU based MANNs are subject to. Finally, LMNs model the evolving expertise of memory and network using a RNN, to determine online their respective weights we evaluate online model adaptation strategies on five sequence prediction tasks, an image classification task, and two language modeling tasks. We show that LMNs are better than other MANNs designed for meta-learning. We also found them to be more accurate and faster than state-of-the-art methods of retuning model parameters for adapting to domain-specific labeled data.

关键词

model adaptation ; plug and play models ; kernel; memory networks; online adaptation

全文

PDF


No Modes Left Behind: Capturing the Data Distribution Effectively Using GANs

摘要

Generative adversarial networks (GANs) while being very versatile in realistic image synthesis, still are sensitive to the input distribution. Given a set of data that has an imbalance in the distribution, the networks are susceptible to missing modes and not capturing the data distribution. While various methods have been tried to improve training of GANs, these have not addressed the challenges of covering the full data distribution. Specifically, a generator is not penalized for missing a mode. We show that these are therefore still susceptible to not capturing the full data distribution. In this paper, we propose a simple approach that combines an encoder based objective with novel loss functions for generator and discriminator that improves the solution in terms of capturing missing modes. We validate that the proposed method results in substantial improvements through its detailed analysis on toy and real datasets. The quantitative and qualitative results demonstrate that the proposed method improves the solution for the problem of missing modes and improves training of GANs.

关键词

generative adversarial networks

全文

PDF


Reduced-Rank Linear Dynamical Systems

摘要

Linear Dynamical Systems are widely used to study the underlying patterns of multivariate time series. A basic assumption of these models is that high-dimensional time series can be characterized by some underlying, low-dimensional and time-varying latent states. However, existing approaches to LDS modeling mostly learn the latent space with a prescribed dimensionality. When dealing with short-length high- dimensional time series data, such models would be easily overfitted. We propose Reduced-Rank Linear Dynamical Systems (RRLDS), to automatically retrieve the intrinsic dimensionality of the latent space during model learning. Our key observation is that the rank of the dynamics matrix of LDS captures the intrinsic dimensionality, and the variational inference with a reduced-rank regularization finally leads to a concise, structured, and interpretable latent space. To enable our method to handle count-valued data, we introduce the dispersion-adaptive distribution to accommodate over-/ equal-/ and under-dispersion nature of such data. Results on both simulated and experimental data demonstrate our model can robustly learn latent space from short-length, noisy, count-valued data and significantly improve the prediction performance over the state-of-the-art methods.

关键词

Dynamical Systems, Count Data, Bayesian Inference, Dimensionality Reduction

全文

PDF


Wasserstein Distance Guided Representation Learning for Domain Adaptation

摘要

Domain adaptation aims at generalizing a high-performance learner on a target domain via utilizing the knowledge distilled from a source domain which has a different but related data distribution. One solution to domain adaptation is to learn domain invariant feature representations while the learned representations should also be discriminative in prediction. To learn such representations, domain adaptation frameworks usually include a domain invariant representation learning approach to measure and reduce the domain discrepancy, as well as a discriminator for classification. Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by the domain critic, to estimate empirical Wasserstein distance between the source and target samples and optimizes the feature extractor network to minimize the estimated Wasserstein distance in an adversarial manner. The theoretical advantages of Wasserstein distance for domain adaptation lie in its gradient property and promising generalization bound. Empirical studies on common sentiment and image classification adaptation datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art domain invariant representation learning approaches.

关键词

domain adaptation; wasserstein distance; representation learning

全文

PDF


Compact Multi-Label Learning

摘要

Embedding methods have shown promising performance in multi-label prediction, as they can discover the dependency of labels. Most embedding methods cannot well align the input and output, which leads to degradation in prediction performance. Besides, they suffer from expensive prediction computational costs when applied to large-scale datasets. To address the above issues, this paper proposes a Co-Hashing (CoH) method by formulating multi-label learning from the perspective of cross-view learning. CoH first regards the input and output as two views, and then aims to learn a common latent hamming space, where input and output pairs are compressed into compact binary embeddings. CoH enjoys two key benefits: 1) the input and output can be well aligned, and their correlations are explored; 2) the prediction is very efficient using fast cross-view kNN search in the hamming space. Moreover, we provide the generalization error bound for our method. Extensive experiments on eight real-world datasets demonstrate the superiority of the proposed CoH over the state-of-the-art methods in terms of both prediction accuracy and efficiency.

关键词

全文

PDF


Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling

摘要

Deep neural networks (DNNs) are powerful machine learning models and have succeeded in various artificial intelligence tasks. Although various architectures and modules for the DNNs have been proposed, selecting and designing the appropriate network structure for a target problem is a challenging task. In this paper, we propose a method to simultaneously optimize the network structure and weight parameters during neural network training. We consider a probability distribution that generates network structures, and optimize the parameters of the distribution instead of directly optimizing the network structure. The proposed method can apply to the various network structure optimization problems under the same framework. We apply the proposed method to several structure optimization problems such as selection of layers, selection of unit types, and selection of connections using the MNIST, CIFAR-10, and CIFAR-100 datasets. The experimental results show that the proposed method can find the appropriate and competitive network structures.

关键词

Neural Network; Deep Learning; Information Geometric Optimization

全文

PDF


Learning to Interact With Learning Agents

摘要

AI and machine learning methods are increasingly interacting with and seeking information from people, robots, and other learning agents. Consequently, the learning dynamics of these agents creates fundamentally new challenges for existing methods. Motivated by the application of learning to offer personalized deals to users, we highlight these challenges by studying a variant of the framework of "online learning using expert advice with bandit feedback." In our setting, we consider each expert as a learning agent, seeking to more accurately reflect real-world applications. The bandit feedback leads to additional challenges in this setting: at time t, only the expert it that has been selected by the central algorithm (forecaster) receives feedback from the environment and gets to learn at this time. A natural question to ask is whether it is possible to be competitive with the best expert j* had it seen all the feedback, i.e., competitive with the policy of always selecting expert j*. We prove the following hardness result — without any coordination between the forecaster and the experts, it is impossible to design a forecaster achieving no-regret guarantees. We then consider a practical assumption allowing the forecaster to guide the learning process of the experts by blocking some of the feedback observed by them from the environment, i.e., restricting the selected expert it to learn at time t for some time steps. With this additional coordination power, we design our forecaster LIL that achieves no-regret guarantees, and we provide regret bounds dependent on the learning dynamics of the best expert j*.

关键词

learning agents; online learning; bandit feedback; expert advice; no-regret guarantees

全文

PDF


Attend and Diagnose: Clinical Time Series Analysis Using Attention Models

摘要

With widespread adoption of electronic health records, there is an increased emphasis for predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long Short-Term Memory (LSTM) units, deep neural networks have achieved state-of-the-art results in several clinical prediction tasks. Despite the success of RNN, its sequential nature prohibits parallelized computing, thus making it inefficient particularly when processing long sequences. Recently, architectures which are based solely on attention mechanisms have shown remarkable success in transduction tasks in NLP, while being computationally superior. In this paper, for the first time, we utilize attention models for clinical time-series modeling, thereby dispensing recurrence entirely. We develop the SAnD (Simply Attend and Diagnose) architecture, which employs a masked, self-attention mechanism, and uses positional encoding and dense interpolation strategies for incorporating temporal order. Furthermore, we develop a multi-task variant of SAnD to jointly infer models with multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we demonstrate that the proposed approach achieves state-of-the-art performance in all tasks, outperforming LSTM models and classical baselines with hand-engineered features.

关键词

Attention model; Recurrent Neural Network; Time Series Analysis; Deep Learning; Clinical; Healthcare

全文

PDF


Reinforcement Learning in POMDPs With Memoryless Options and Option-Observation Initiation Sets

摘要

Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the initiation set of options conditional on the previously-executed option, and show that options with such Option-Observation Initiation Sets (OOIs) are at least as expressive as Finite State Controllers (FSCs), a state-of-the-art approach for learning in POMDPs. OOIs are easy to design based on an intuitive description of the task, lead to explainable policies and keep the top-level and option policies memoryless. Our experiments show that OOIs allow agents to learn optimal policies in challenging POMDPs, while being much more sample-efficient than a recurrent neural network over options.

关键词

Reinforcement Learning; Options; Hierarchical Reinforcement Learning; Partially Observable MDP

全文

PDF


Active Lifelong Learning With "Watchdog"

摘要

Lifelong learning intends to learn new consecutive tasks depending on previously accumulated experiences, i.e., knowledge library. However, the knowledge among different new coming tasks are imbalance. Therefore, in this paper, we try to mimic an effective "human cognition" strategy by actively sorting the importance of new tasks in the process of unknown-to-known and selecting to learn the important tasks with more information preferentially. To achieve this, we consider to assess the importance of the new coming task, i.e., unknown or not, as an outlier detection issue, and design a hierarchical dictionary learning model consisting of two-level task descriptors to sparse reconstruct each task with the l0 norm constraint. The new coming tasks are sorted depending on the sparse reconstruction score in descending order, and the task with high reconstruction score will be permitted to pass, where this mechanism is called as "watchdog." Next, the knowledge library of the lifelong learning framework encode the selected task by transferring previous knowledge, and then can also update itself with knowledge from both previously learned task and current task automatically. For model optimization, the alternating direction method is employed to solve our model and converges to a fixed point. Extensive experiments on both benchmark datasets and our own dataset demonstrate the effectiveness of our proposed model especially in task selection and dictionary learning.

关键词

Transfer, Adaptation, Multitask Learning; Online Learning; Active Learning; Supervised Learning

全文

PDF


Leaf-Smoothed Hierarchical Softmax for Ordinal Prediction

摘要

We propose a new approach to conditional probability estimation for ordinal labels. First, we present a specialized hierarchical softmax variant inspired by k-d trees that leverages the inherent spatial structure of (potentially-multivariate) ordinal labels. We then adapt ideas from signal processing on noisy graphs to develop a novel regularizer for such hierarchical softmax models. Both our tree structure and regularizer independently boost the sample efficiency of a deep learning model across a series of simulation studies. Furthermore, the combination of these two techniques produces additive gains and the model does not suffer from the pathologies of other approaches in the literature. We validate our approach empirically on a suite of real-world datasets, in some cases reducing the error by nearly half in comparison to other popular methods in the literature. Our results demonstrate that our method is a powerful new modeling technique for conditional probability estimation of ordinal labels, especially in the low-to-mid sample size regimes such as those often found in biological and other physical sciences.

关键词

density estimation; deep learning; neural networks

全文

PDF


Reliable Multi-View Clustering

摘要

With the advent of multi-view data, multi-view learning (MVL) has become an important research direction in machine learning. It is usually expected that multi-view algorithms can obtain better performance than that of merely using a single view. However, previous researches have pointed out that sometimes the utilization of multiple views may even deteriorate the performance. This will be a stumbling block for the practical use of MVL in real applications, especially for tasks requiring high dependability. Thus, it is eager to design reliable multi-view approaches, such that their performance is never degenerated by exploiting multiple views.This issue is vital but rarely studied. In this paper, we focus on clustering and propose the Reliable Multi-View Clustering (RMVC) method. Based on several candidate multi-view clusterings, RMVC maximizes the worst-case performance gain against the best single view clustering, which is equivalently expressed as no label information available. Specifically, employing the squared χ2 distance for clustering comparison makes the formulation of RMVC easy to solve, and an efficient strategy is proposed for optimization. Theoretically, it can be proved that the performance of RMVC will never be significantly decreased under some assumption. Experimental results on a number of data sets demonstrate that the proposed method can effectively improve the reliability of multi-view clustering.

关键词

Multi-View; Clustering; Reliability

全文

PDF


Action Branching Architectures for Deep Reinforcement Learning

摘要

Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG).

关键词

Reinforcement Learning; Deep Learning

全文

PDF


Detecting Adversarial Examples Through Image Transformation

摘要

Deep Neural Networks (DNNs) have demonstrated remarkable performance in a diverse range of applications. Along with the prevalence of deep learning, it has been revealed that DNNs are vulnerable to attacks. By deliberately crafting adversarial examples, an adversary can manipulate a DNN to generate incorrect outputs, which may lead catastrophic consequences in applications such as disease diagnosis and self-driving cars. In this paper, we propose an effective method to detect adversarial examples in image classification. Our key insight is that adversarial examples are usually sensitive to certain image transformation operations such as rotation and shifting. In contrast, a normal image is generally immune to such operations. We implement this idea of image transformation and evaluate its performance in oblivious attacks. Our experiments with two datasets show that our technique can detect nearly 99% of adversarial examples generated by the state-of-the-art algorithm. In addition to oblivious attacks, we consider the case of white-box attacks. We propose to introduce randomness in the process of image transformation, which can achieve a detection ratio of around 70%.

关键词

Adversarial examples; Convolutional neural network; Image transformation

全文

PDF


Selective Verification Strategy for Learning From Crowds

摘要

To deal with the low qualities of web workers in crowdsourcing, many unsupervised label aggregation methods have been investigated but most of them provide inconsistent performance. In this paper, we explore the learning from crowds with selective verification problem. In addition to the noisy responses from the crowds, it also collects the ground truths for a well-chosen subset of tasks as the reference, then aggregates the redundant responses based on the patterns provided by both the supervised and unsupervised signal. To improve the labeling efficiency, we propose the EBM selecting strategy for choosing the verification subset, which is based on the loss error minimization. Specifically, we first establish the expected loss error given the semi-supervised learning estimate, then find the subset that minimizes this selecting criterion. We do extensive empirical comparisons on both synthetic and real-world datasets to show the benefits of this new learning setting as well as our proposal.

关键词

label aggregation

全文

PDF


Fourier Feature Approximations for Periodic Kernels in Time-Series Modelling

摘要

Gaussian Processes (GPs) provide an extremely powerful mechanism to model a variety of problems but incur an O(N3) complexity in the number of data samples. Common approximation methods rely on what are often termed inducing points but still typically incur an O(NM2) complexity in the data and corresponding inducing points. Using Random Fourier Feature (RFF) maps, we overcome this by transforming the problem into a Bayesian Linear Regression formulation upon which we apply a Bayesian Variational treatment that also allows learning the corresponding kernel hyperparameters, likelihood and noise parameters. In this paper we introduce an alternative method using Fourier series to obtain spectral representations of common kernels, in particular for periodic warpings, which surprisingly have a convergent, non-random form using special functions, requiring fewer spectral features to approximate their corresponding kernel to high accuracy. Using this, we can fuse the Random Fourier Feature spectral representations of common kernels with their periodic counterparts to show how they can more effectively and expressively learn patterns in time-series for both interpolation and extrapolation. This method combines robustness, scalability and equally importantly, interpretability through a symbolic declarative grammar that is both functionally and humanly intuitive — a property that is crucial for explainable decision making. Using probabilistic programming and Variational Inference we are able to efficiently optimise over these rich functional representations. We show significantly improved Gram matrix approximation errors, and also demonstrate the method in several time-series problems comparing other commonly used approaches such as recurrent neural networks.

关键词

Fourier Series; Random Features; Kernel Approximation; Bayesian Variational Inference; Kernel Compositions

全文

PDF


Sum-Product Autoencoding: Encoding and Decoding Representations Using Sum-Product Networks

摘要

Sum-Product Networks (SPNs) are a deep probabilistic architecture that up to now has been successfully employed for tractable inference. Here, we extend their scope towards unsupervised representation learning: we encode samples into continuous and categorical embeddings and show that they can also be decoded back into the original input space by leveraging MPE inference. We characterize when this Sum-Product Autoencoding (SPAE) leads to equivalent reconstructions and extend it towards dealing with missing embedding information. Our experimental results on several multi-label classification problems demonstrate that SPAE is competitive with state-of-the-art autoencoder architectures, even if the SPNs were never trained to reconstruct their inputs.

关键词

sum-product networks, representation learning, tractable probabilistic models, unsupervised learning, autoencoders

全文

PDF


Bayesian Functional Optimization

摘要

Bayesian optimization (BayesOpt) is a derivative-free approach for sequentially optimizing stochastic black-box functions. Standard BayesOpt, which has shown many successes in machine learning applications, assumes a finite dimensional domain which often is a parametric space. The parameter space is defined by the features used in the function approximations which are often selected manually. Therefore, the performance of BayesOpt inevitably depends on the quality of chosen features. This paper proposes a new Bayesian optimization framework that is able to optimize directly on the domain of function spaces. The resulting framework, Bayesian Functional Optimization (BFO), not only extends the application domains of BayesOpt to functional optimization problems but also relaxes the performance dependency on the chosen parameter space. We model the domain of functions as a reproducing kernel Hilbert space (RKHS), and use the notion of Gaussian processes on a real separable Hilbert space. As a result, we are able to define traditional improvement-based (PI and EI) and optimistic acquisition functions (UCB) as functionals. We propose to optimize the acquisition functionals using analytic functional gradients that are also proved to be functions in a RKHS. We evaluate BFO in three typical functional optimization tasks: i) a synthetic functional optimization problem, ii) optimizing activation functions for a multi-layer perceptron neural network, and iii) a reinforcement learning task whose policies are modeled in RKHS.

关键词

Bayesian optimization; functional optimization; kernel methods

全文

PDF


Kernel Cross-Correlator

摘要

Cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking. Beyond the linear cross-correlator, this paper proposes a kernel cross-correlator (KCC) that breaks traditional limitations. First, by introducing the kernel trick, the KCC extends the linear cross-correlation to non-linear space, which is more robust to signal noises and distortions. Second, the connection to the existing works shows that KCC provides a unified solution for correlation filters. Third, KCC is applicable to any kernel function and is not limited to circulant structure on training data, thus it is able to predict affine transformations with customized properties. Last, by leveraging the fast Fourier transform (FFT), KCC eliminates direct calculation of kernel vectors, thus achieves better performance yet still with a reasonable computational cost. Comprehensive experiments on visual tracking and human activity recognition using wearable devices demonstrate its robustness, flexibility, and efficiency. The source codes of both experiments are released at https://github.com/wang-chen/KCC.

关键词

cross-correlator; correlation filter; visual tracking; template matching

全文

PDF


Efficient Test-Time Predictor Learning With Group-Based Budget

摘要

Learning a test-time efficient predictor is becoming important for many real-world applications for which accessing the necessary features of a test data is costly. In this paper, we propose a novel approach to learn a linear predictor by introducing binary indicator variables for selecting feature groups and imposing an explicit budget constraint to up-bound the total cost of selected groups. We solve the convex relaxation of the resulting problem, with the optimal solution proved to be integers for most of the elements at the optima and independent of the specific forms of loss functions used. We propose a general and efficient algorithm to solve the relaxation problem by leveraging the existing SVM solvers with various loss functions. For certain loss functions, the proposed algorithm can further take the advantage of SVM solver in the primal to tackle large-scale and high-dimensional data. Experiments on various datasets demonstrate the effectiveness and efficiency of the proposed method by comparing with various baselines.

关键词

全文

PDF


Learning Transferable Subspace for Human Motion Segmentation

摘要

Temporal data clustering is a challenging task. Existing methods usually explore data self-representation strategy, which may hinder the clustering performance in insufficient or corrupted data scenarios. In real-world applications, we are easily accessible to a large amount of related labeled data. To this end, we propose a novel transferable subspace clustering approach by exploring useful information from relevant source data to enhance clustering performance in target temporal data. We manage to transform the original data into a shared low-dimensional and distinctive feature space by jointly seeking an effective domain-invariant projection. In this way, the well-labeled source knowledge can help obtain a more discriminative target representation. Moreover, a graph regularizer is designed to incorporate temporal information to preserve more sequence knowledge into the learned representation. Extensive experiments based on three human motion datasets illustrate that our approach is able to outperform state-of-the-art temporal data clustering methods.

关键词

Transfer Learning; Segmentation

全文

PDF


Information-Theoretic Domain Adaptation Under Severe Noise Conditions

摘要

Cross-domain data reconstruction methods derive a shared transformation across source and target domains. These methods usually make a specific assumption on noise, which exhibits limited ability when the target data are contaminated by different kinds of complex noise in practice. To enhance the robustness of domain adaptation under severe noise conditions, this paper proposes a novel reconstruction based algorithm in an information-theoretic setting. Specifically, benefiting from the theoretical property of correntropy, the proposed algorithm is distinguished with: detecting the contaminated target samples without making any specific assumption on noise; greatly suppressing the negative influence of noise on cross-domain transformation. Moreover, a relative entropy based regularization of the transformation is incorporated to avoid trivial solutions with the reaped theoretic advantages, i.e., non-negativity and scale-invariance. For optimization, a half-quadratic technique is developed to minimize the non-convex information-theoretic objectives with explicitly guaranteed convergence. Experiments on two real-world domain adaptation tasks demonstrate the superiority of our method.

关键词

domain adaptation; information-theoretic learning; correntropy

全文

PDF


Zero-Shot Learning via Class-Conditioned Deep Generative Models

摘要

We present a deep generative model for Zero-Shot Learning (ZSL). Unlike most existing methods for this problem, that represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), which also facilitates learning highly discriminative feature representations for the inputs. The entire framework is learned end-to-end using only the seen-class training data. At test time, the label for an unseen-class test input is the class that maximizes the VAE lower bound. We further extend the model to a (i) semi-supervised/transductive setting by leveraging unlabeled unseen-class data via an unsupervised learning module, and (ii) few-shot learning where we also have a small number of labeled inputs from the unseen classes. We compare our model with several state-of-the-art methods through a comprehensive set of experiments on a variety of benchmark data sets.

关键词

全文

PDF


Sparse Gaussian Conditional Random Fields on Top of Recurrent Neural Networks

摘要

Predictions of time-series are widely used in different disciplines. We propose CoR, Sparse Gaussian Conditional Random Fields (SGCRF) on top of Recurrent Neural Networks (RNN), for problems of this kind. CoR gains advantages from both RNN and SGCRF. It can not only effectively represent the temporal correlations in observed data, but can also learn the structured information of the output. CoR is challenging to train because it is a hybrid of deep neural networks and densely-connected graphical models. Alternative training can be a tractable way to train CoR, and furthermore, an end-to-end training method is proposed to train CoR more efficiently. CoR is evaluated by both synthetic data and real-world data, and it shows a significant improvement in performance over state-of-the-art methods.

关键词

gaussian conditional random fields; recurrent neural networks; time-series prediction

全文

PDF


On Multi-Relational Link Prediction With Bilinear Models

摘要

We study bilinear embedding models for the task of multi-relational link prediction and knowledge graph completion. Bilinear models belong to the most basic models for this task, they are comparably efficient to train and use, and they can provide good prediction performance. The main goal of this paper is to explore the expressiveness of and the connections between various bilinear models proposed in the literature. In particular, a substantial number of models can be represented as bilinear models with certain additional constraints enforced on the embeddings. We explore whether or not these constraints lead to universal models, which can in principle represent every set of relations, and whether or not there are subsumption relationships between various models. We report results of an independent experimental study that evaluates recent bilinear models in a common experimental setup. Finally, we provide evidence that relation-level ensembles of multiple bilinear models can achieve state-of-the-art prediction performance.

关键词

Relational Learning; Embedding Learning; Knowledge Graph

全文

PDF


Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework

摘要

Hardware accelerations of deep learning systems have been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural networks (DNNs). An algorithm-hardware co-optimization framework is developed, which is applicable to different DNN types, sizes, and application scenarios. The algorithm part adopts the general block-circulant matrices to achieve a fine-grained tradeoff of accuracy and compression ratio. It applies to both fully-connected and convolutional layers and contains a mathematically rigorous proof of the effectiveness of the method. The proposed algorithm reduces computational complexity per layer from O(n2) to O(n log n) and storage complexity from O(n2) to O(n), both for training and inference. The hardware part consists of highly efficient Field Programmable Gate Array (FPGA)-based implementations using effective reconfiguration, batch processing, deep pipelining, resource re-using, and hierarchical control. Experimental results demonstrate that the proposed framework achieves at least 152X speedup and 71X energy efficiency gain compared with IBM TrueNorth processor under the same test accuracy. It achieves at least 31X energy efficiency gain compared with the reference FPGA-based work.

关键词

Deep learning; block-circulant matrix; acceleration; compression; FPGA

全文

PDF


On the ERM Principle With Networked Data

摘要

Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d. assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results of this problem only deal with the classical empirical risk minimization (ERM) principle that always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem which leads to appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.

关键词

Generalization error bounds; Non-i.i.d. data; U-statistics; Fully polynomial-time approximation scheme

全文

PDF


High Rank Matrix Completion With Side Information

摘要

We address the problem of high-rank matrix completion with side information. In contrast to existing work dealing with side information, which assume that the data matrix is low-rank, we consider the more general scenario where the columns of the data matrix are drawn from a union of low-dimensional subspaces, which can lead to a high rank matrix. Our goal is to complete the matrix while taking advantage of the side information. To do so, we use the self-expressive property of the data, searching for a sparse representation of each column of matrix as a combination of a few other columns. More specifically, we propose a factorization of the data matrix as the product of side information matrices with an unknown interaction matrix, under which each column of the data matrix can be reconstructed using a sparse combination of other columns. As our proposed optimization, searching for missing entries and sparse coefficients, is non-convex and NP-hard, we propose a lifting framework, where we couple sparse coefficients and missing values and define an equivalent optimization that is amenable to convex relaxation. We also propose a fast implementation of our convex framework using a Linearized Alternating Direction Method. By extensive experiments on both synthetic and real data, and, in particular, by studying the problem of multi-label learning, we demonstrate that our method outperforms existing techniques in both low-rank and high-rank data regimes.

关键词

matrix completion; sparse representation; multilabel learning; high-rank data

全文

PDF


Adversarial Learning of Portable Student Networks

摘要

Effective methods for learning deep neural networks with fewer parameters are urgently required, since storage and computations of heavy neural networks have largely prevented their widespread use on mobile devices. Compared with algorithms which directly remove weights or filters for obtaining considerable compression and speed-up ratios, training thin deep networks exploiting the student-teacher learning paradigm is more flexible. However, it is very hard to determine which formulation is optimal to measure the information inherited from teacher networks. To overcome this challenge, we utilize the generative adversarial network (GAN) to learn the student network. In practice, the generator is exactly the student network with extremely less parameters and the discriminator is used as a teaching assistant for distinguishing features extracted from student and teacher networks. By simultaneously optimizing the generator and the discriminator, the resulting student network can produce features of input data with the similar distribution as that of features of the teacher network. Extensive experimental results on benchmark datasets demonstrate that the proposed method is capable of learning well-performed portable networks, which is superior to the state-of-the-art methods.

关键词

全文

PDF


Orthant-Wise Passive Descent Algorithms for Training L1-Regularized Models

摘要

The L1-regularized models are widely used for sparse regression or classification tasks. In this paper, we propose the orthant-wise passive descent algorithm (OPDA) for solving L1-regularized models, as an improved substitute of proximal algorithms, which are the standard tools for optimizing the models nowadays. OPDA uses a stochastic variance-reduced gradient (SVRG) to initialize the descent direction, then apply a novel alignment operator to encourage each element keeping the same sign after one iteration of update, so the parameter remains in the same orthant as before. It also explicitly suppresses the magnitude of each element to impose sparsity. The quasi-Newton update can be utilized to incorporate curvature information and accelerate the speed. We prove a linear convergence rate for OPDA on general smooth and strongly-convex loss functions. By conducting experiments on L1-regularized logistic regression and convolutional neural networks, we show that OPDA outperforms state-of-the-art stochastic proximal algorithms, implying a wide range of applications in training sparse models.

关键词

convex optimization; sparse learning; quasi newton

全文

PDF


MERCS: Multi-Directional Ensembles of Regression and Classification Trees

摘要

Learning a function f(X) that predicts Y from X is the archetypal Machine Learning (ML) problem. Typically, both sets of attributes (i.e., X,Y) have to be known before a model can be trained. When this is not the case, or when functions f(X) that predict Y from X are needed for varying X and Y, this may introduce significant overhead (separate learning runs for each function). In this paper, we explore the possibility of omitting the specification of X and Y at training time altogether, by learning a multi-directional, or versatile model, which will allow prediction of any Y from any X. Specifically, we introduce a decision tree-based paradigm that generalizes the well-known Random Forests approach to allow for multi-directionality. The result of these efforts is a novel method called MERCS: Multi-directional Ensembles of Regression and Classification treeS. Experiments show the viability of the approach.

关键词

Decision Trees; Random Forests; Versatile Models;

全文

PDF


Decoupled Convolutions for CNNs

摘要

In this paper, we are interested in designing small CNNs by decoupling the convolution along the spatial and channel domains. Most existing decoupling techniques focus on approximating the filter matrix through decomposition. In contrast, we provide a two-step interpretation of the standard convolution from the filter at a single location to all locations, which is exactly equivalent to the standard convolution. Motivated by the observations in our decoupling view, we propose an effective approach to relax the sparsity of the filter in spatial aggregation by learning a spatial configuration, and reduce the redundancy by reducing the number of intermediate channels. Our approach achieves comparable classification performance with the standard uncoupled convolution, but with a smaller model size over CIFAR-100, CIFAR-10 and ImageNet.

关键词

decomposing filter; decoupled convolution; balance decoupling spatial convolution; spatial configuration

全文

PDF


Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching

摘要

This paper proposes a cooperative learning algorithm to train both the undirected energy-based model and the directed latent variable model jointly. The learning algorithm interweaves the maximum likelihood algorithms for learning the two models, and each iteration consists of the following two steps: (1) Modified contrastive divergence for energy-based model: The learning of the energy-based model is based on the contrastive divergence, but the finite-step MCMC sampling of the model is initialized from the synthesized examples generated by the latent variable model instead of being initialized from the observed examples. (2) MCMC teaching of the latent variable model: The learning of the latent variable model is based on how the MCMC in (1) changes the initial synthesized examples generated by the latent variable model, where the latent variables that generate the initial synthesized examples are known so that the learning is essentially supervised. Our experiments show that the cooperative learning algorithm can learn realistic models of images.

关键词

Deep generative models; Convolutional neural networks; Contrastive divergence; MCMC teaching

全文

PDF


Partial Multi-Label Learning

摘要

It is expensive and difficult to precisely annotate objects with multiple labels. Instead, in many real tasks, annotators may roughly assign each object with a set of candidate labels. The candidate set contains at least one but unknown number of ground-truth labels, and is usually adulterated with some irrelevant labels. In this paper, we formalize such problems as a new learning framework called partial multi-label learning (PML). To solve the PML problem, a confidence value is maintained for each candidate label to estimate how likely it is a ground-truth label of the instance. On one hand, the relevance ordering of labels on each instance is optimized by minimizing a rank loss weighted by the confidences; on the other hand, the confidence values are optimized by further exploiting structure information in feature and label spaces.Experimental results on various datasets show that the proposed approach is effective for solving PML problems.

关键词

全文

PDF


Semi-Supervised AUC Optimization Without Guessing Labels of Unlabeled Data

摘要

Semi-supervised learning, which aims to construct learners that automatically exploit the large amount of unlabeled data in addition to the limited labeled data, has been widely applied in many real-world applications. AUC is a well-known performance measure for a learner, and directly optimizing AUC may result in a better prediction performance. Thus, semi-supervised AUC optimization has drawn much attention. Existing semi-supervised AUC optimization methods exploit unlabeled data by explicitly or implicitly estimating the possible labels of the unlabeled data based on various distributional assumptions. However, these assumptions may be violated in many real-world applications, and estimating labels based on the violated assumption may lead to poor performance. In this paper, we argue that, in semi-supervised AUC optimization, it is unnecessary to guess the possible labels of the unlabeled data or prior probability based on any distributional assumptions. We analytically show that the AUC risk can be estimated unbiasedly by simply treating the unlabeled data as both positive and negative. Based on this finding, two semi-supervised AUC optimization methods named Samult and Sampura are proposed. Experimental results indicate that the proposed methods outperform the existing methods.

关键词

Semi-Supervised Learning; AUC Optimization

全文

PDF


Perception Coordination Network: A Framework for Online Multi-Modal Concept Acquisition and Binding

摘要

A biologically plausible neural network model named Perception Coordination Network (PCN) is proposed for online multi-modal concept acquisition and binding. It is a hierarchical structure inspired by the structure of the brain, and functionally divided into the primary sensory area (PSA), the primary sensory association area (SAA), and the higher order association area (HAA). The PSA processes many elementary features, e.g., colors, shapes, syllables, and basic flavors, etc. The SAA combines these elementary features to represent the unimodal concept of an object, e.g., the image, name and taste of an apple, etc. The HAA connects several primary sensory association areas like a function of synaesthesia, which means associating the image, name and taste of an object. PCN is able to continuously acquire and bind multi-modal concepts in an online way. Experimental results suggest that PCN can handle the multi-modal concept acquisition and binding problem effectively.

关键词

multi-modal learning; concept acquisition and binding; online incremental learning;

全文

PDF


HodgeRank With Information Maximization for Crowdsourced Pairwise Ranking Aggregation

摘要

Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains. However, task requester usually has a limited amount of budget, thus it is desirable to have a policy to wisely allocate the budget to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on Hodge Decomposition of pairwise ranking data with multiple workers. The principle exhibits two scenarios of active sampling: Fisher information maximization that leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization that selects samples with the largest information gain from prior to posterior, which gives a supervised sampling involving the labels collected. Experiments show that the proposed methods boost the sampling efficiency as compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments.

关键词

全文

PDF


Deep Neural Network Compression With Single and Multiple Level Quantization

摘要

Network quantization is an effective solution to compress deep neural networks for practical usage. Existing network quantization methods cannot sufficiently exploit the depth information to generate low-bit compressed network. In this paper, we propose two novel network quantization approaches, single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit quantization (ternary). We are the first to consider the network quantization from both width and depth level. In the width level, parameters are divided into two parts: one for quantization and the other for re-training to eliminate the quantization loss. SLQ leverages the distribution of the parameters to improve the width level. In the depth level, we introduce incremental layer compensation to quantize layers iteratively which decreases the quantization loss in each iteration. The proposed approaches are validated with extensive experiments based on the state-of-the-art neural networks including AlexNet, VGG-16, GoogleNet and ResNet-18. Both SLQ and MLQ achieve impressive results.

关键词

Network compression; Network quantization

全文

PDF


Informed Non-Convex Robust Principal Component Analysis With Features

摘要

We revisit the problem of robust principal component analysis with features acting as prior side information. To this aim, a novel, elegant, non-convex optimization approach is proposed to decompose a given observation matrix into a low-rank core and the corresponding sparse residual. Rigorous theoretical analysis of the proposed algorithm results in exact recovery guarantees with low computational complexity. Aptly designed synthetic experiments demonstrate that our method is the first to wholly harness the power of non-convexity over convexity in terms of both recoverability and speed. That is, the proposed non-convex approach is more accurate and faster compared to the best available algorithms for the problem under study. Two real-world applications, namely image classification and face denoising further exemplify the practical superiority of the proposed method.

关键词

RPCA; Nonconvex;Low-rank;Convergence

全文

PDF


Dictionary Learning in Optimal Metric Space

摘要

Dictionary learning has been widely used in machine learning field to address many real-world applications, such as classification and denoising. In recent years, many new dictionary learning methods have been proposed. Most of them are designed to solve unsupervised problem without any prior information or supervised problem with the label information. But in real world, as usual, we can only obtain limited side information as prior information rather than label information. The existing methods don’t take into account the side information, let alone learning a good dictionary through using the side information. To tackle it, we propose a new unified unsupervised model which naturally integrates metric learning to enhance dictionary learning model with fully utilizing the side information. The proposed method updates metric space and dictionary adaptively and alternatively, which ensures learning optimal metric space and dictionary simultaneously. Besides, our method can also deal well with highdimensional data. Extensive experiments show the efficiency of our proposed method, and a better performance can be derived in real-world image clustering applications.

关键词

Dictionary Learning; Metric Space

全文

PDF


Automatic Model Selection in Subspace Clustering via Triplet Relationships

摘要

This paper addresses both the model selection (i.e., estimating the number of clusters K) and subspace clustering problems in a unified model. The real data always distribute on a union of low-dimensional sub-manifolds which are embedded in a high-dimensional ambient space. In this regard, the state-of-the-art subspace clustering approaches firstly learn the affinity among samples, followed by a spectral clustering to generate the segmentation. However, arguably, the intrinsic geometrical structures among samples are rarely considered in the optimization process. In this paper, we propose to simultaneously estimate K and segment the samples according to the local similarity relationships derived from the affinity matrix. Given the correlations among samples, we define a novel data structure termed the Triplet, each of which reflects a high relevance and locality among three samples which are aimed to be segmented into the same subspace. While the traditional pairwise distance can be close between inter-cluster samples lying on the intersection of two subspaces, the wrong assignments can be avoided by the hyper-correlation derived from the proposed triplets due to the complementarity of multiple constraints. Sequentially, we propose to greedily optimize a new model selection reward to estimate K according to the correlations between inter-cluster triplets. We simultaneously optimize a fusion reward based on the similarities between triplets and clusters to generate the final segmentation. Extensive experiments on the benchmark datasets demonstrate the effectiveness and robustness of the proposed approach.

关键词

全文

PDF


A Poisson Gamma Probabilistic Model for Latent Node-Group Memberships in Dynamic Networks

摘要

We present a probabilistic model for learning from dynamic relational data, wherein the observed interactions among networked nodes are modeled via the Bernoulli Poisson link function, and the underlying network structure are characterized by nonnegative latent node-group memberships, which are assumed to be gamma distributed. The latent memberships evolve according to Markov processes.The optimal number of latent groups can be determined by data itself. The computational complexity of our method scales with the number of non-zero links, which makes it scalable to large sparse dynamic relational data. We present batch and online Gibbs sampling algorithms to perform model inference. Finally, we demonstrate the model's performance on both synthetic and real-world datasets compared to state-of-the-art methods.

关键词

Bayesian Learning;Networks;Social Networking and Community Identification

全文

PDF


New l2,1-Norm Relaxation of Multi-Way Graph Cut for Clustering

摘要

The clustering methods have absorbed even-increasing attention in machine learning and computer vision communities in recent years. Exploring manifold information in multi-way graph cut clustering, such as ratio cut clustering, has shown its promising performance. However, traditional multi-way ratio cut clustering method is NP-hard and thus the spectral solution may deviate from the optimal one. In this paper, we propose a new relaxed multi-way graph cut clustering method, where l2,1-norm distance instead of squared distance is utilized to preserve the solution having much more clearer cluster structures. Furthermore, the resulting solution is constrained with normalization to obtain more sparse representation, which can encourage the solution to contain more discrete values with many zeros. For the objective function, it is very difficult to optimize due to minimizing the ratio of two non-smooth items. To address this problem, we transform the objective function into a quadratic problem on the Stiefel manifold (QPSM), and introduce a novel yet efficient iterative algorithm to solve it. Experimental results on several benchmark datasets show that our method significantly outperforms several state-of-the-art clustering approaches.

关键词

Clustering, Graph Cut

全文

PDF


Efficient K-Shot Learning With Regularized Deep Networks

摘要

Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, it is often sub-optimal and requires very careful optimization to prevent severe over-fitting to small datasets. The problem of sub-optimality and overfitting, is due in part to the large number of parameters used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning. To prevent overfitting, our key strategy is to cluster the model parameters while ensuring intra-cluster similarity and inter-cluster diversity of the parameters, effectively regularizing the dimensionality of the parameter search space. In particular, we identify groups of neurons within each layer of a deep network that shares similar activation patterns. When the network is to be fine-tuned for a classification task using only k examples, we propagate a single gradient to all of the neuron parameters that belong to the same group. The grouping of neurons is non-trivial as neuron activations depend on the distribution of the input data. To efficiently search for optimal groupings conditioned on the input data, we propose a reinforcement learning search strategy using recurrent networks to learn the optimal group assignments for each network layer. Experimental results show that our method can be easily applied to several popular convolutional neural networks and improve upon other state-of-the-art fine-tuning based k-shot learning strategies by more than 10%.

关键词

deep learning; k-shot learning; regularization

全文

PDF


Learning With Single-Teacher Multi-Student

摘要

In this paper we study a new learning problem defined as "Single-Teacher Multi-Student" (STMS) problem, which investigates how to learn a series of student (simple and specific) models from a single teacher (complex and universal) model. Taking the multiclass and binary classification for example, we focus on learning multiple binary classifiers from a single multiclass classifier, where each of binary classifier is responsible for a certain class. This actually derives from some realistic problems, such as identifying the suspect based on a comprehensive face recognition system. By treating the already-trained multiclass classifier as the teacher, and multiple binary classifiers as the students, we propose a gated support vector machine (gSVM) as a solution. A series of gSVMs are learned with the help of single teacher multiclass classifier. The teacher's help is two-fold; first, the teacher's score provides the gated values for students' decision; second, the teacher can guide the students to accommodate training examples with different difficulty degrees. Extensive experiments on real datasets validate its effectiveness.

关键词

multiclass classification; binary classification; teacher student

全文

PDF


Tau-FPL: Tolerance-Constrained Learning in Linear Time

摘要

In many real-world applications, learning a classifier with false-positive rate under a specified tolerance is appealing. Existing approaches either introduce prior knowledge dependent label cost or tune parameters based on traditional classifiers, which are of limitation in methodology since they do not directly incorporate the false-positive rate tolerance. In this paper, we propose a novel scoring-thresholding approach, tau-False Positive Learning (tau-FPL) to address this problem. We show that the scoring problem which takes the false-positive rate tolerance into accounts can be efficiently solved in linear time, also an out-of-bootstrap thresholding method can transform the learned ranking function into a low false-positive classifier. Both theoretical analysis and experimental results show superior performance of the proposed tau-FPL over the existing approaches.

关键词

Neyman-Pearson Classification; Euclidean Projection; Partial-AUC Optimization; Learning to Rank

全文

PDF


Multi-Layer Multi-View Classification for Alzheimer’s Disease Diagnosis

摘要

In this paper, we propose a novel multi-view learning method for Alzheimer's Disease (AD) diagnosis, using neuroimaging and genetics data. Generally, there are several major challenges associated with traditional classification methods on multi-source imaging and genetics data. First, the correlation between the extracted imaging features and class labels is generally complex, which often makes the traditional linear models ineffective. Second, medical data may be collected from different sources (i.e., multiple modalities of neuroimaging data, clinical scores or genetics measurements), therefore, how to effectively exploit the complementarity among multiple views is of great importance. In this paper, we propose a Multi-Layer Multi-View Classification (ML-MVC) approach, which regards the multi-view input as the first layer, and constructs a latent representation to explore the complex correlation between the features and class labels. This captures the high-order complementarity among different views, as we exploit the underlying information with a low-rank tensor regularization. Intrinsically, our formulation elegantly explores the nonlinear correlation together with complementarity among different views, and thus improves the accuracy of classification. Finally, the minimization problem is solved by the Alternating Direction Method of Multipliers (ADMM). Experimental results on Alzheimer's Disease Neuroimaging Initiative (ADNI) data sets validate the effectiveness of our proposed method.

关键词

multi-view learning; Neuroimage classification

全文

PDF


Latent Semantic Aware Multi-View Multi-Label Classification

摘要

For real-world applications, data are often associated with multiple labels and represented with multiple views. Most existing multi-label learning methods do not sufficiently consider the complementary information among multiple views, leading to unsatisfying performance. To address this issue, we propose a novel approach for multi-view multi-label learning based on matrix factorization to exploit complementarity among different views. Specifically, under the assumption that there exists a common representation across different views, the uncovered latent patterns are enforced to be aligned across different views in kernel spaces. In this way, the latent semantic patterns underlying in data could be well uncovered and this enhances the reasonability of the common representation of multiple views. As a result, the consensus multi-view representation is obtained which encodes the complementarity and consistence of different views in latent semantic space. We provide theoretical guarantee for the strict convexity for our method by properly setting parameters. Empirical evidence shows the clear advantages of our method over the state-of-the-art ones.

关键词

multi-view learning; multi-label learning

全文

PDF


ROAR: Robust Label Ranking for Social Emotion Mining

摘要

Understanding and predicting latent emotions of users toward online contents, known as social emotion mining, has become increasingly important to both social platforms and businesses alike. Despite recent developments, however, very little attention has been made to the issues of nuance, subjectivity, and bias of social emotions. In this paper, we fill this gap by formulating social emotion mining as a robust label ranking problem, and propose: (1) a robust measure, named as G-mean-rank (GMR), which sets a formal criterion consistent with practical intuition; and (2) a simple yet effective label ranking model, named as ROAR, that is more robust toward unbalanced datasets (which are common). Through comprehensive empirical validation using 4 real datasets and 16 benchmark semi-synthetic label ranking datasets, and a case study, we demonstrate the superiorities of our proposals over 2 popular label ranking measures and 6 competing label ranking algorithms. The datasets and implementations used in the empirical validation are available for access.

关键词

Label Ranking; Imbalanced Data

全文

PDF


Beyond Link Prediction: Predicting Hyperlinks in Adjacency Space

摘要

This paper addresses the hyperlink prediction problem in hypernetworks. Different from the traditional link prediction problem where only pairwise relations are considered as links, our task here is to predict the linkage of multiple nodes, i.e., hyperlink. Each hyperlink is a set of an arbitrary number of nodes which together form a multiway relationship. Hyperlink prediction is challenging---since the cardinality of a hyperlink is variable, existing classifiers based on a fixed number of input features become infeasible. Heuristic methods, such as the common neighbors and Katz index, do not work for hyperlink prediction, since they are restricted to pairwise similarities. In this paper, we formally define the hyperlink prediction problem, and propose a new algorithm called Coordinated Matrix Minimization (CMM), which alternately performs nonnegative matrix factorization and least square matching in the vertex adjacency space of the hypernetwork, in order to infer a subset of candidate hyperlinks that are most suitable to fill the training hypernetwork. We evaluate CMM on two novel tasks: predicting recipes of Chinese food, and finding missing reactions of metabolic networks. Experimental results demonstrate the superior performance of our method over many seemingly promising baselines.

关键词

link prediction; hypergraph; hyperlink; EM; metabolic network

全文

PDF


An End-to-End Deep Learning Architecture for Graph Classification

摘要

Neural networks are typically designed to deal with data in tensor forms. In this paper, we propose a novel neural network architecture accepting graphs of arbitrary structure. Given a dataset containing graphs in the form of (G,y) where G is a graph and y is its class, we aim to develop neural networks that read the graphs directly and learn a classification function. There are two main challenges: 1) how to extract useful features characterizing the rich information encoded in a graph for classification purpose, and 2) how to sequentially read a graph in a meaningful and consistent order. To address the first challenge, we design a localized graph convolution model and show its connection with two graph kernels. To address the second challenge, we design a novel SortPooling layer which sorts graph vertices in a consistent order so that traditional neural networks can be trained on the graphs. Experiments on benchmark graph classification datasets demonstrate that the proposed architecture achieves highly competitive performance with state-of-the-art graph kernels and other graph neural network methods. Moreover, the architecture allows end-to-end gradient-based training with original graphs, without the need to first transform graphs into vectors.

关键词

graph classification; graph neural networks; graph kernel

全文

PDF


Feature-Induced Labeling Information Enrichment for Multi-Label Learning

摘要

In multi-label learning, each training example is represented by a single instance (feature vector) while associated with multiple class labels simultaneously. The task is to learn a predictive model from the training examples which can assign a set of proper labels for the unseen instance. Most existing approaches make use of multi-label training examples by exploiting their labeling information in a crisp manner, i.e. one class label is either fully relevant or irrelevant to the instance. In this paper, a novel multi-label learning approach is proposed which aims to enrich the labeling information by leveraging the structural information in feature space. Firstly, the underlying structure of feature space is characterized by conducting sparse reconstruction among the training examples. Secondly, the reconstruction information is conveyed from feature space to label space so as to enrich the original categorical labels into numerical ones. Thirdly, the multi-label predictive model is induced by learning from training examples with enriched labeling information. Extensive experiments on fifteen benchmark data sets clearly validate the effectiveness of the proposed feature-induced strategy for enhancing labeling information of multi-label examples.

关键词

全文

PDF


Interpreting CNN Knowledge via an Explanatory Graph

摘要

This paper learns a graphical model, namely an explanatory graph, which reveals the knowledge hierarchy hidden inside a pre-trained CNN. Considering that each filter in a conv-layer of a pre-trained CNN usually represents a mixture of object parts, we propose a simple yet efficient method to automatically disentangles different part patterns from each filter, and construct an explanatory graph. In the explanatory graph, each node represents a part pattern, and each edge encodes co-activation relationships and spatial relationships between patterns. More importantly, we learn the explanatory graph for a pre-trained CNN in an unsupervised manner, i.e., without a need of annotating object parts. Experiments show that each graph node consistently represents the same object part through different images. We transfer part patterns in the explanatory graph to the task of part localization, and our method significantly outperforms other approaches.

关键词

Convolutional Neural Network; Graphical Model; Interpretable Model

全文

PDF


Examining CNN Representations With Respect to Dataset Bias

摘要

Given a pre-trained CNN without any testing samples, this paper proposes a simple yet effective method to diagnose feature representations of the CNN. We aim to discover representation flaws caused by potential dataset bias. More specifically, when the CNN is trained to estimate image attributes, we mine latent relationships between representations of different attributes inside the CNN. Then, we compare the mined attribute relationships with ground-truth attribute relationships to discover the CNN's blind spots and failure modes due to dataset bias. In fact, representation flaws caused by dataset bias cannot be examined by conventional evaluation strategies based on testing images, because testing images may also have a similar bias. Experiments have demonstrated the effectiveness of our method.

关键词

Deep learning; Interpretable model; Knowledge representation

全文

PDF


Optimal Margin Distribution Clustering

摘要

Maximum margin clustering (MMC), which borrows the large margin heuristic from support vector machine (SVM), has achieved more accurate results than traditional clustering methods. The intuition is that, for a good clustering, when labels are assigned to different clusters, SVM can achieve a large minimum margin on this data. Recent studies, however, disclosed that maximizing the minimum margin does not necessarily lead to better performance, and instead, it is crucial to optimize the margin distribution. In this paper, we propose a novel approach ODMC (Optimal margin Distribution Machine for Clustering), which tries to cluster the data and achieve optimal margin distribution simultaneously. Specifically, we characterize the margin distribution by the first- and second-order statistics, i.e., the margin mean and variance, and extend a stochastic mirror descent method to solve the resultant minimax problem. Moreover, we prove theoretically that ODMC has the same convergence rate with state-of-the-art cutting plane based algorithms but involves much less computation cost per iteration, so our method is much more scalable than existing approaches. Extensive experiments on UCI data sets show that ODMC is significantly better than compared methods, which verifies the superiority of optimal margin distribution learning.

关键词

全文

PDF


Training Set Debugging Using Trusted Items

摘要

Training set bugs are flaws in the data that adversely affect machine learning. The training set is usually too large for manual inspection, but one may have the resources to verify a few trusted items. The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning. Specifically, our approach seeks the smallest set of changes to the training set labels such that the model learned from this corrected training set predicts labels of the trusted items correctly. We flag the items whose labels are changed as potential bugs, whose labels can be checked for veracity by human experts. To find the bugs in this way is a challenging combinatorial bilevel optimization problem, but it can be relaxed into a continuous optimization problem.Experiments on toy and real data demonstrate that our approach can identify training set bugs effectively and suggest appropriate changes to the labels. Our algorithm is a step toward trustworthy machine learning.

关键词

Machine Learning, Debugging, Data Cleaning, trustworthy machine learning

全文

PDF


EMD Metric Learning

摘要

Earth Mover's Distance (EMD), targeting at measuring the many-to-many distances, has shown its superiority and been widely applied in computer vision tasks, such as object recognition, hyperspectral image classification and gesture recognition. However, there is still little effort concentrated on optimizing the EMD metric towards better matching performance. To tackle this issue, we propose an EMD metric learning algorithm in this paper. In our method, the objective is to learn a discriminative distance metric for EMD ground distance matrix generation which can better measure the similarity between compared subjects. More specifically, given a group of labeled data from different categories, we first select a subset of training data and then optimize the metric for ground distance matrix generation. Here, both the EMD metric and the EMD flow-network are alternatively optimized until a steady EMD value can be achieved. This method is able to generate a discriminative ground distance matrix which can further improve the EMD distance measurement. We then apply our EMD metric learning method on two tasks, i.e., multi-view object classification and document classification. The experimental results have shown better performance of our proposed EMD metric learning method compared with the traditional EMD method and the state-of-the-art methods. It is noted that the proposed EMD metric learning method can be also used in other applications.

关键词

Vision; Object Recognition; Classification

全文

PDF


Distant-Supervision of Heterogeneous Multitask Learning for Social Event Forecasting With Multilingual Indicators

摘要

Open-source indicators such as social media can be very effective precursors for forecasting future societal events. As events are often preceded by social indicators generated by groups of people speaking many different languages, multiple languages need to be considered to ensure comprehensive event forecasting. However, this leads to several technical challenges for traditional models: 1) high dimension, sparsity, and redundancy of features; 2) translation correlation among the multilingual features. and 3) lack of language-wise supervision. In order to simultaneously address these issues, we present a novel model capable of distant-supervision of heterogeneous multitask learning (DHML) for multilingual spatial social event forecasting. This model maps the multilingual heterogeneous features into several latent semantic spaces and then enforces a similar sparsity pattern across them all, using distant supervision across all the languages involved. Optimizing this model creates a difficult problem that is nonconvex and nonsmooth that can then be decomposed into simpler subproblems using the Alternative Direction Multiplier of Methods (ADMM). A novel dynamic programming-based algorithm is proposed to solve one challenging subproblem efficiently. Theoretical properties of the proposed algorithm are analyzed. The results of extensive experiments on multiple real-world datasets are presented to demonstrate the effectiveness, efficiency, and interpretability of the proposed approach.

关键词

multi-task learning; multi-instance learning; spatial event forecasting

全文

PDF


Label Distribution Learning by Optimal Transport

摘要

Label distribution learning (LDL) is a novel learning paradigm to deal with some real-world applications, especially when we care more about the relative importance of different labels in description of an instance. Although some approaches have been proposed to learn the label distribution, they could not explicitly learn and leverage the label correlation, which plays an importance role in LDL. In this paper, we proposed an approach to learn the label distribution and exploit label correlations simultaneously based on the Optimal Transport (OT) theory. The problem is solved by alternatively learning the transportation (hypothesis) and ground metric (label correlations). Besides, we provide perhaps the first data-dependent risk bound analysis for label distribution learning by Sinkhorn distance, a commonly-used relaxation for OT distance. Experimental results on several real-world datasets comparing with several state-of-the-art methods validate the effectiveness of our approach.

关键词

machine learning; optimal transport;label distribution learning

全文

PDF


Substructure Assembling Network for Graph Classification

摘要

Graphs are natural data structures adopted to represent real-world data of complex relationships. In recent years, a surge of interest has been received to build predictive models over graphs, with prominent examples in chemistry, computational biology, and social networks. The overwhelming complexity of graph space often makes it challenging to extract interpretable and discriminative structural features for classification tasks. In this work, we propose a novel neural network structure called Substructure Assembling Network (SAN) to extract graph features and improve the generalization performance of graph classification. The key innovation of our work is a unified substructure assembling unit, which is a variant of Recurrent Neural Network (RNN) designed to hierarchically assemble useful pieces of graph components so as to fabricate discriminative substructures. SAN adopts a sequential, probabilistic decision process, and therefore it can tune substructure features in a finer granularity. Meanwhile, the parameterized soft decisions can be continuously improved with supervised learning through back-propagation, leading to optimizable search trajectories. Overall, SAN embraces both the flexibility of combinatorial pattern search and the strong optimizability of deep learning, and delivers promising results as well as interpretable structural features in graph classification against state-of-the-art techniques.

关键词

graph learning; graph representation learning; graph classification; deep graph neural network; deep learning

全文

PDF


Hypergraph Learning With Cost Interval Optimization

摘要

In many classification tasks, the misclassification costs of different categories usually vary significantly. Under such circumstances, it is essential to identify the importance of different categories and thus assign different misclassification losses in many applications, such as medical diagnosis, saliency detection and software defect prediction. However, we note that it is infeasible to determine the accurate cost value without great domain knowledge. In most common cases, we may just have the information that which category is more important than the other categories, i.e., the identification of defect-prone softwares is more important than that of defect-free. To tackle these issues, in this paper, we propose a hypergraph learning method with cost interval optimization, which is able to handle cost interval when data is formulated using the high-order relationships. In this way, data correlations are modeled by a hypergraph structure, which has the merit to exploit the underlying relationships behind the data. With a cost-sensitive hypergraph structure, in order to improve the performance of the classifier without precise cost value, we further introduce cost interval optimization to hypergraph learning. In this process, the optimization on cost interval achieves better performance instead of choosing uncertain fixed cost in the learning process. To evaluate the effectiveness of the proposed method, we have conducted experiments on two groups of dataset, i.e., the NASA Metrics Data Program (NASA) dataset and UCI Machine Learning Repository (UCI) dataset. Experimental results and comparisons with state-of-the-art methods have exhibited better performance of our proposed method.

关键词

Cost interval, Hypergraph

全文

PDF


Learning Mixtures of Random Utility Models

摘要

We tackle the problem of identifiability and efficient learning of mixtures of Random Utility Models (RUMs). We show that when the PDFs of utility distributions are symmetric, the mixture of k RUMs (denoted by k-RUM) is not identifiable when the number of alternatives m is no more than 2k-1. On the other hand, when m ≥ max{4k-2,6}, any k-RUM is generically identifiable. We then propose three algorithms for learning mixtures of RUMs: an EM-based algorithm, which we call E-GMM, a direct generalized-method-of-moments (GMM) algorithm, and a sandwich (GMM-E-GMM) algorithm that combines the other two. Experiments on synthetic data show that the sandwich algorithm achieves the highest statistical efficiency and GMM is the most computationally efficient. Experiments on real-world data at Preflib show that Gaussian k-RUMs provide better fitness than a single Gaussian RUM, the Plackett-Luce model, and mixtures of Plackett-Luce models w.r.t. commonly-used model fitness criteria. To the best of our knowledge, this is the first work on learning mixtures of general RUMs.

关键词

random utility model (RUM), mixture model, identifiability, EM, generalized method of moments

全文

PDF


Direct Hashing Without Pseudo-Labels

摘要

Recently, binary hashing has been widely applied to data compression, ranking and nearest-neighbor search. Although some promising results have been achieved, effectively optimizing sign function related objectives is still highly challenging and thus pseudo-labels are inevitably used. In this paper, we propose a novel general framework to simultaneously minimize the measurement distortion and the quantization loss, which enable to learn hash functions directly without requiring the pseudo-labels. More significantly, a novel W-Shape Loss (WSL) is specifically developed for hashing so that both the two separate steps of relaxation and the NP-hard discrete optimization are successfully discarded. The experimental results demonstrate that the retrieval performance both in uni-modal and cross-modal settings can be improved.

关键词

Hamming, Binary Codes, Quantization Loss, Pseudo-Label

全文

PDF


Learning Graph-Structured Sum-Product Networks for Probabilistic Semantic Maps

摘要

We introduce Graph-Structured Sum-Product Networks (GraphSPNs), a probabilistic approach to structured prediction for problems where dependencies between latent variables are expressed in terms of arbitrary, dynamic graphs. While many approaches to structured prediction place strict constraints on the interactions between inferred variables, many real-world problems can be only characterized using complex graph structures of varying size, often contaminated with noise when obtained from real data. Here, we focus on one such problem in the domain of robotics. We demonstrate how GraphSPNs can be used to bolster inference about semantic, conceptual place descriptions using noisy topological relations discovered by a robot exploring large-scale office spaces. Through experiments, we show that GraphSPNs consistently outperform the traditional approach based on undirected graphical models, successfully disambiguating information in global semantic maps built from uncertain, noisy local evidence. We further exploit the probabilistic nature of the model to infer marginal distributions over semantic descriptions of as yet unexplored places and detect spatial environment configurations that are novel and incongruent with the known evidence.

关键词

Machine Learning, Robotics, Semantic Mapping, Structured Prediction, Deep Learning

全文

PDF


Label Distribution Learning by Exploiting Sample Correlations Locally

摘要

Label distribution learning (LDL) is a novel multi-label learning paradigm proposed in recent years for solving label ambiguity. Existing approaches typically exploit label correlations globally to improve the effectiveness of label distribution learning, by assuming that the label correlations are shared by all instances. However, different instances may share different label correlations, and few correlations are globally applicable in real-world applications. In this paper, we propose a new label distribution learning algorithm by exploiting sample correlations locally (LDL-SCL). To encode the influence of local samples, we design a local correlation vector for each instance based on the clustered local samples. Then we predict the label distribution for an unseen instance based on the original features and the local correlation vector simultaneously. Experimental results demonstrate that LDL-SCL can effectively deal with the label distribution problems and perform remarkably better than the state-of-the-art LDL methods.

关键词

Label distribution learning

全文

PDF


ATRank: An Attention-Based User Behavior Modeling Framework for Recommendation

摘要

A user can be represented as what he/she does along the history. A common way to deal with the user modeling problem is to manually extract all kinds of aggregated features over the heterogeneous behaviors, which may fail to fully represent the data itself due to limited human instinct. Recent works usually use RNN-based methods to give an overall embedding of a behavior sequence, which then could be exploited by the downstream applications. However, this can only preserve very limited information, or aggregated memories of a person. When a downstream application requires to facilitate the modeled user features, it may lose the integrity of the specific highly correlated behavior of the user, and introduce noises derived from unrelated behaviors. This paper proposes an attention based user behavior modeling framework called ATRank, which we mainly use for recommendation tasks. Heterogeneous user behaviors are considered in our model that we project all types of behaviors into multiple latent semantic spaces, where influence can be made among the behaviors via self-attention. Downstream applications then can use the user behavior vectors via vanilla attention. Experiments show that ATRank can achieve better performance and faster training process. We further explore ATRank to use one unified model to predict different types of user behaviors at the same time, showing a comparable performance with the highly optimized individual models.

关键词

User Modeling; Attention Model; Recommendation;

全文

PDF


Budget-Constrained Multi-Armed Bandits With Multiple Plays

摘要

We study the multi-armed bandit problem with multiple plays and a budget constraint for both the stochastic and the adversarial setting. At each round, exactly K out of N possible arms have to be played (with 1 ≤ K <= N). In addition to observing the individual rewards for each arm played, the player also learns a vector of costs which has to be covered with an a-priori defined budget B. The game ends when the sum of current costs associated with the played arms exceeds the remaining budget. Firstly, we analyze this setting for the stochastic case, for which we assume each arm to have an underlying cost and reward distribution with support [cmin, 1] and [0, 1], respectively. We derive an Upper Confidence Bound (UCB) algorithm which achieves O(NK4 log B) regret. Secondly, for the adversarial case in which the entire sequence of rewards and costs is fixed in advance, we derive an upper bound on the regret of order O(√NB log(N/K)) utilizing an extension of the well-known Exp3 algorithm. We also provide upper bounds that hold with high probability and a lower bound of order Ω((1 – K/N) √NB/K).

关键词

Multi-Armed Bandits, Online Learning, Machine Learning

全文

PDF


Rocket Launching: A Universal and Efficient Framework for Training Well-Performing Light Net

摘要

Models applied on real time response tasks, like click-through rate (CTR) prediction model, require high accuracy and rigorous response time. Therefore, top-performing deep models of high depth and complexity are not well suited for these applications with the limitations on the inference time. In order to get neural networks of better performance given the time limitations, we propose a universal framework that exploits a booster net to help train the lightweight net for prediction. We dub the whole process rocket launching, where the booster net is used to guide the learning of our light net throughout the whole training process. We analyze different loss functions aiming at pushing the light net to behave similarly to the booster net. Besides, we use one technique called gradient block to improve the performance of light net and booster net further. Experiments on benchmark datasets and real-life industrial advertisement data show the effectiveness of our proposed method.

关键词

Model Compressing; Neural Networks; Knowledge Distillation

全文

PDF


SC2Net: Sparse LSTMs for Sparse Coding

摘要

The iterative hard-thresholding algorithm (ISTA) is one of the most popular optimization solvers to achieve sparse codes. However, ISTA suffers from following problems: 1) ISTA employs non-adaptive updating strategy to learn the parameters on each dimension with a fixed learning rate. Such a strategy may lead to inferior performance due to the scarcity of diversity; 2) ISTA does not incorporate the historical information into the updating rules, and the historical information has been proven helpful to speed up the convergence. To address these challenging issues, we propose a novel formulation of ISTA (named as adaptive ISTA) by introducing a novel \textit{adaptive momentum vector}. To efficiently solve the proposed adaptive ISTA, we recast it as a recurrent neural network unit and show its connection with the well-known long short term memory (LSTM) model. With a new proposed unit, we present a neural network (termed SC2Net) to achieve sparse codes in an end-to-end manner. To the best of our knowledge, this is one of the first works to bridge the $\ell_1$-solver and LSTM, and may provide novel insights in understanding model-based optimization and LSTM. Extensive experiments show the effectiveness of our method on both unsupervised and supervised tasks.

关键词

sparse representation; LSTM; RNN; model-based optimization

全文

PDF


Adaptive Quantization for Deep Neural Network

摘要

In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Deep model quantization can be used for reducing the computation and memory costs of DNNs, and deploying complex DNNs on mobile equipment. In this work, we propose an optimization framework for deep model quantization. First, we propose a measurement to estimate the effect of parameter quantization errors in individual layers on the overall model prediction accuracy. Then, we propose an optimization process based on this measurement for finding optimal quantization bit-width for each layer. This is the first work that theoretically analyse the relationship between parameter quantization errors of individual layers and model accuracy. Our new quantization algorithm outperforms previous quantization optimization methods, and achieves 20-40% higher compression rate compared to equal bit-width quantization at the same model prediction accuracy.

关键词

Deep Model Compression; Deep Model Quantization

全文

PDF


Non-Parametric Outliers Detection in Multiple Time Series A Case Study: Power Grid Data Analysis

摘要

In this study we consider the problem of outlier detection with multiple co-evolving time series data. To capture both the temporal dependence and the inter-series relatedness, a multi-task non-parametric model is proposed, which can be extended to data with a broader exponential family distribution by adopting the notion of Bregman divergence. Albeit convex, the learning problem can be hard as the time series accumulate. In this regards, an efficient randomized block coordinate descent (RBCD) algorithm is proposed. The model and the algorithm is tested with a real-world application, involving outlier detection and event analysis in power distribution networks with high resolution multi-stream measurements. It is shown that the incorporation of inter-series relatedness enables the detection of system level events which would otherwise be unobservable with traditional methods.

关键词

time series; novelty detection

全文

PDF


A Spherical Hidden Markov Model for Semantics-Rich Human Mobility Modeling

摘要

We study the problem of modeling human mobility from semantic trace data, wherein each GPS record in a trace is associated with a text message that describes the user's activity. Existing methods fall short in unveiling human movement regularities for such data, because they either do not model the text data at all or suffer from text sparsity severely. We propose SHMM, a multi-modal spherical hidden Markov model for semantics-rich human mobility modeling. Under the hidden Markov assumption, SHMM models the generation process of a given trace by jointly considering the observed location, time, and text at each step of the trace. The distinguishing characteristic of SHMM is the text modeling part. We use fixed-size vector representations to encode the semantics of the text messages, and model the generation of the l2-normalized text embeddings on a unit sphere with the von Mises-Fisher (vMF) distribution. Compared with other alternatives like multi-variate Gaussian, our choice of the vMF distribution not only incurs much fewer parameters, but also better leverages the discriminative power of text embeddings in a directional metric space. The parameter inference for the vMF distribution is non-trivial since it involves functional inversion of ratios of Bessel functions. We theoretically prove, for the first time, that: 1) the classical Expectation-Maximization algorithm is able to work with vMF distributions; and 2) while closed-form solutions are hard to be obtained for the M-step, Newton's method is guaranteed to converge to the optimal solution with quadratic convergence rate. We have performed extensive experiments on both synthetic and real-life data. The results on synthetic data verify our theoretical analysis; while the results on real-life data demonstrate that SHMM learns meaningful semantics-rich mobility models, outperforms state-of-the-art mobility models for next location prediction, and incurs lower training cost.

关键词

Mobility Modeling; vMF distribution

全文

PDF


Weighted Multi-View Spectral Clustering Based on Spectral Perturbation

摘要

Considering the diversity of the views, assigning the multiviews with different weights is important to multi-view clustering. Several multi-view clustering algorithms have been proposed to assign different weights to the views. However, the existing weighting schemes do not simultaneously consider the characteristic of multi-view clustering and the characteristic of related single-view clustering. In this paper, based on the spectral perturbation theory of spectral clustering, we propose a weighted multi-view spectral clustering algorithm which employs the spectral perturbation to model the weights of the views. The proposed weighting scheme follows the two basic principles: 1) the clustering results on each view should be close to the consensus clustering result, and 2) views with similar clustering results should be assigned similar weights. According to spectral perturbation theory, the largest canonical angle is used to measure the difference between spectral clustering results. In this way, the weighting scheme can be formulated into a standard quadratic programming problem. Experimental results demonstrate the superiority of the proposed algorithm.

关键词

Multi-view Clustering; Spectral Clusteing; Spectral Perturbation

全文

PDF


Learning the Behavior of a Dynamical System Via a “20 Questions” Approach

摘要

Using a graphical discrete dynamical system to model a networked social system, the problem of inferring the behavior of the system can be formulated as the problem of learning the local functions of the dynamical system. We investigate the problem assuming an active form of interaction with the system through queries. We consider two classes of local functions (namely, symmetric and threshold functions) and two interaction modes, namely batch mode (where all the queries must be submitted together) and adaptive mode (where the set of queries submitted at a stage may rely on the answers received to previous queries). We develop complexity results and efficient heuristics that produce query sets under both query modes. We demonstrate the performance of our heuristics through experiments on over 20 well-known networks.

关键词

Graphical dynamical systems, threshold and symmetric functions, inference, batch and adaptive query modes

全文

PDF


Knowledge, Fairness, and Social Constraints

摘要

In the context of fair allocation of indivisible items, fairness concepts often compare the satisfaction of an agent to the satisfaction she would have from items that are not allocated to her: in particular, envy-freeness requires that no agent prefers the share of someone else to her own share. We argue that these notions could also be defined relative to the knowledge that an agent has on how the items that she does not receive are distributed among other agents. We define a family of epistemic notions of envy-freeness, parameterized by a social graph, where an agent observes the share of her neighbours but not of her non-neighbours. We also define an intermediate notion between envy-freeness and proportionality, also parameterized by a social graph. These weaker notions of envy-freeness are useful when seeking a fair allocation, since envy-freeness is often too strong. We position these notions with respect to known ones, thus revealing new rich hierarchies of fairness concepts. Finally, we present a very general framework that covers all the existing and many new fairness concepts.

关键词

allocation problems; fair division; envy-freeness; proportionality; social networks

全文

PDF


POMDP-Based Decision Making for Fast Event Handling in VANETs

摘要

Malicious vehicle agents broadcast fake information about traffic events and thereby undermine the benefits of vehicle-to-vehicle communication in vehicular ad-hoc networks (VANETs). Trust management schemes addressing this issue do not focus on effective/fast decision making in reacting to traffic events. We propose a Partially Observable Markov Decision Process (POMDP) based approach to balance the trade-off between information gathering and exploiting actions resulting in faster responses. Our model copes with malicious behavior by maintaining it as part of a small state space, thus is scalable for large VANETs. We also propose an algorithm to learn model parameters in a dynamic behavior setting. Experimental results demonstrate that our model can effectively balance the decision quality and response time while still being robust to sophisticated malicious attacks.

关键词

POMDP; Trust; Information sharing; Decision-making; VANETs

全文

PDF


An Ant-Based Algorithm to Solve Distributed Constraint Optimization Problems

摘要

As an important population-based algorithm, ant colony optimization (ACO) has been successfully applied into various combinatorial optimization problems. However, much existing work in ACO focuses on solving centralized problems. In this paper, we present a novel algorithm that takes the power of ants to solve Distributed Constraint Optimization Problems (DCOPs), called ACO_DCOP. In ACO_DCOP, a new mechanism that captures local benefits is proposed to compute heuristic factors and a new method that considers the cost structure of DCOPs is proposed to compute pheromone deltas appropriately. Moreover, pipelining technique is introduced to make full use of the computational capacity and improve the efficiency. In our theoretical analysis, we prove that ACO_DCOP is an anytime algorithm. Our empirical evaluation indicates that ACO_DCOP is able to find solutions of equal or significantly higher quality than state-of-the-art DCOP algorithms.

关键词

全文

PDF


Preallocation and Planning Under Stochastic Resource Constraints

摘要

Resource constraints frequently complicate multi-agent planning problems. Existing algorithms for resource-constrained, multi-agent planning problems rely on the assumption that the constraints are deterministic. However, frequently resource constraints are themselves subject to uncertainty from external influences. Uncertainty about constraints is especially challenging when agents must execute in an environment where communication is unreliable, making on-line coordination difficult. In those cases, it is a significant challenge to find coordinated allocations at plan time depending on availability at run time. To address these limitations, we propose to extend algorithms for constrained multi-agent planning problems to handle stochastic resource constraints. We show how to factorize resource limit uncertainty and use this to develop novel algorithms to plan policies for stochastic constraints. We evaluate the algorithms on a search-and-rescue problem and on a power-constrained planning domain where the resource constraints are decided by nature. We show that plans taking into account all potential realizations of the constraint obtain significantly better utility than planning for the expectation, while causing fewer constraint violations.

关键词

multi-agent planning; uncertainty; stochastic constraint

全文

PDF


Manipulative Elicitation — A New Attack on Elections with Incomplete Preferences

摘要

Lu and Boutilier proposed a novel approach based on "minimax regret" to use classical score based voting rules in the setting where preferences can be any partial (instead of complete) orders over the set of alternatives. We show here that such an approach is vulnerable to a new kind of manipulation which was not present in the classical (where preferences are complete orders) world of voting. We call this attack "manipulative elicitation." More specifically, it may be possible to (partially) elicit the preferences of the agents in a way that makes some distinguished alternative win the election who may not be a winner if we elicit every preference completely. More alarmingly, we show that the related computational task is polynomial time solvable for a large class of voting rules which includes all scoring rules, maximin, Copeland α for every α ∈ [0,1], simplified Bucklin voting rules, etc. We then show that introducing a parameter per pair of alternatives which specifies the minimum number of partial preferences where this pair of alternatives must be comparable makes the related computational task of manipulative elicitation NP-complete for all common voting rules including a class of scoring rules which includes the plurality, k-approval, k-veto, veto, and Borda voting rules, maximin, Copeland α for every α ∈ [0,1], and simplified Bucklin voting rules. Hence, in this work, we discover a fundamental vulnerability in using minimax regret based approach in partial preferential setting and propose a novel way to tackle it.

关键词

全文

PDF


Control Argumentation Frameworks

摘要

Dynamics of argumentation is the family of techniques concerned with the evolution of an argumentation framework (AF), for instance to guarantee that a given set of arguments is accepted. This work proposes Control Argumentation Frameworks (CAFs), a new approach that generalizes existing techniques, namely normal extension enforcement, by accommodating the possibility of uncertainty in dynamic scenarios. A CAF is able to deal with situations where the exact set of arguments is unknown and subject to evolution, and the existence (or direction) of some attacks is also unknown. It can be used by an agent to ensure that a set of arguments is part of one (or every) extension whatever the actual set of arguments and attacks. A QBF encoding of reasoning with CAFs provides a computational mechanism for determining whether and how this goal can be reached. We also provide some results concerning soundness and completeness of the proposed encoding as well as complexity issues.

关键词

Argumentation; Multiagent Systems

全文

PDF


Decentralised Learning in Systems With Many, Many Strategic Agents

摘要

Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently fails in scalability and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable due to the explosion in the size of the state-action space as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This allows us to show, for the first time, successful convergence to optimal behaviour in systems with an unbounded number of interacting adaptive learners. Studying the asymptotic regime of N-player stochastic games, we devise a learning protocol that is guaranteed to converge to equilibrium policies even when the number of agents is extremely large. Our method is model-free and completely decentralised so that each agent need only observe its local state information and its realised rewards. We validate these theoretical results by showing convergence to Nash-equilibrium policies in applications from economics and control theory with thousands of strategically interacting agents.

关键词

Multi-agent Reinforcement Learning, Stochastic Games, Large Games, Mean Field Games, Reinforcement Learning

全文

PDF


Dilated FCN for Multi-Agent 2D/3D Medical Image Registration

摘要

2D/3D image registration to align a 3D volume and 2D X-ray images is a challenging problem due to its ill-posed nature and various artifacts presented in 2D X-ray images. In this paper, we propose a multi-agent system with an auto attention mechanism for robust and efficient 2D/3D image registration. Specifically, an individual agent is trained with dilated Fully Convolutional Network (FCN) to perform registration in a Markov Decision Process (MDP) by observing a local region, and the final action is then taken based on the proposals from multiple agents and weighted by their corresponding confidence levels. The contributions of this paper are threefold. First, we formulate 2D/3D registration as a MDP with observations, actions, and rewards properly defined with respect to X-ray imaging systems. Second, to handle various artifacts in 2D X-ray images, multiple local agents are employed efficiently via FCN-based structures, and an auto attention mechanism is proposed to favor the proposals from regions with more reliable visual cues. Third, a dilated FCN-based training mechanism is proposed to significantly reduce the Degree of Freedom in the simulation of registration environment, and drastically improve training efficiency by an order of magnitude compared to standard CNN-based training method. We demonstrate that the proposed method achieves high robustness on both spine cone beam Computed Tomography data with a low signal-to-noise ratio and data from minimally invasive spine surgery where severe image artifacts and occlusions are presented due to metal screws and guide wires, outperforming other state-of-the-art methods (single agent-based and optimization-based) by a large margin.

关键词

Markov Decision Process; Image Registration

全文

PDF


Strategic Coalitions With Perfect Recall

摘要

The paper proposes a bimodal logic that describes an interplay between distributed knowledge modality and coalition know-how modality. Unlike other similar systems, the one proposed here assumes perfect recall by all agents. Perfect recall is captured in the system by a single axiom. The main technical results are the soundness and the completeness theorems for the proposed logical system.

关键词

coalition; strategy; perfect recall; know-how; completeness

全文

PDF


The Role of Data-Driven Priors in Multi-Agent Crowd Trajectory Estimation

摘要

Resource constraints frequently complicate multi-agent planning problems. Existing algorithms for resource-constrained, multi-agent planning problems rely on the assumption that the constraints are deterministic. However, frequently resource constraints are themselves subject to uncertainty from external influences. Uncertainty about constraints is especially challenging when agents must execute in an environment where communication is unreliable, making on-line coordination difficult. In those cases, it is a significant challenge to find coordinated allocations at plan time depending on availability at run time. To address these limitations, we propose to extend algorithms for constrained multi-agent planning problems to handle stochastic resource constraints. We show how to factorize resource limit uncertainty and use this to develop novel algorithms to plan policies for stochastic constraints. We evaluate the algorithms on a search-and-rescue problem and on a power-constrained planning domain where the resource constraints are decided by nature. We show that plans taking into account all potential realizations of the constraint obtain significantly better utility than planning for the expectation, while causing fewer constraint violations.

关键词

Artificial intelligence; Machine Learning; Multi-agent system; Computer Simulation; Computer vision

全文

PDF


Dynamic Pricing for Reusable Resources in Competitive Market With Stochastic Demand

摘要

The market for selling reusable products (e.g., car rental, cloud services and network access resources) is growing rapidly over the last few years, where service providers maximize their revenues through setting optimal prices. While there has been lots of research on pricing optimization, existing works often ignore dynamic property of demand and the competition among providers. Thus, existing pricing solutions might be far from optimal in realistic markets. This paper provides the first study of service providers' dynamic pricing in consideration of market competition and makes three key contributions along this line. First, we propose a comprehensive model that takes into account the dynamic demand and interaction among providers, and formulate the optimal pricing policy in the competitive market as an equilibrium. Second, we propose an approximate Nash equilibrium to describe providers' behaviors, and design an efficient algorithm to compute the equilibrium which is guaranteed to converge. Third, we derive many properties of the model without any further constraints on demand functions, which can reduce the search space of policies in the algorithm. Finally, we conduct extensive experiments with different parameter settings, showing that the approximate equilibrium is very close to the Nash equilibrium and our proposed pricing policy outperforms existing strategies.

关键词

cloud computing;dynamic pricing;poisson process

全文

PDF


Social Norms of Cooperation With Costly Reputation Building

摘要

Social norms regulate actions in artificial societies, steering collective behavior towards desirable states. In real societies, social norms can solve cooperation dilemmas, constituting a key ingredient in systems of indirect reciprocity: reputations of agents are assigned following social norms that identify their actions as good or bad. This, in turn, implies that agents can discriminate between the different actions of others and that the behaviors of each agent are known to the population at large. This is only possible if the agents report their interactions. Reporting constitutes, this way, a fundamental ingredient of indirect reciprocity, as in its absence cooperation in a multiagent system may collapse. Yet, in most studies to date, reporting is assumed to be cost-free, which collides with many life situations, where reporting can easily incur a cost (costly reputation building). Here we develop a new model of indirect reciprocity that allows reputation building to be costly. We show that only two norms can sustain cooperation under costly reputation building, a feature that requires agents to be able to anticipate the reporting intentions of their opponents, depending sensitively on both the cost of reporting and the accuracy level of reporting anticipation.

关键词

Multiagent systems, Social norms, Indirect reciprocity, Cooperation, Reputations, Evolution

全文

PDF


Multiagent Connected Path Planning: PSPACE-Completeness and How to Deal With It

摘要

In the Multiagent Connected Path Planning problem (MCPP), a team of agents moving in a graph-represented environment must plan a set of start-goal joint paths which ensures global connectivity at each time step, under some communication model. The decision version of this problem asking for the existence of a plan that can be executed in at most a given number of steps is claimed to be NP-complete in the literature. The NP membership proof, however, is not detailed. In this paper, we show that, in fact, even deciding whether a feasible plan exists is a PSPACE-complete problem. Furthermore, we present three algorithms adopting different search paradigms, and we empirically show that they may efficiently obtain a feasible plan, if any exists, in different settings.

关键词

Multiagent Path Planning; Connected Path Planning; Planning on graphs

全文

PDF


Maximizing Influence in an Unknown Social Network

摘要

In many real world applications of influence maximization, practitioners intervene in a population whose social structure is initially unknown. This poses a multiagent systems challenge to act under uncertainty about how the agents are connected. We formalize this problem by introducing exploratory influence maximization, in which an algorithm queries individual network nodes (agents) to learn their links. The goal is to locate a seed set nearly as influential as the global optimum using very few queries. We show that this problem is intractable for general graphs. However, real world networks typically have community structure, where nodes are arranged in densely connected subgroups. We present the ARISEN algorithm, which leverages community structure to find an influential seed set. Experiments on real world networks of homeless youth, village populations in India, and others demonstrate ARISEN's strong empirical performance. To formally demonstrate how ARISEN exploits community structure, we prove an approximation guarantee for ARISEN on graphs drawn from the Stochastic Block Model.

关键词

Influence maximization; network sampling; stochastic block model

全文

PDF


Integrated Cooperation and Competition in Multi-Agent Decision-Making

摘要

Observing that many real-world sequential decision problems are not purely cooperative or purely competitive, we propose a new model—cooperative-competitive process (CCP)—that can simultaneously encapsulate both cooperation and competition. First, we discuss how the CCP model bridges the gap between cooperative and competitive models. Next, we investigate a specific class of group-dominant CCPs, in which agents cooperate to achieve a common goal as their primary objective, while also pursuing individual goals as a secondary objective. We provide an approximate solution for this class of problems that leverages stochastic finite-state controllers. The model is grounded in two multi-robot meeting and box-pushing domains that are implemented in simulation and demonstrated on two real robots.

关键词

Dec-POMDP; POSG; Multi-Robot Coordination and Competition; Multi-Objective Optimization; Slack

全文

PDF


Privacy-Preserving Policy Iteration for Decentralized POMDPs

摘要

We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.

关键词

Decentralized POMDPs; Privacy-Preserving Planning

全文

PDF


HogRider: Champion Agent of Microsoft Malmo Collaborative AI Challenge

摘要

It has been an open challenge for self-interested agents to make optimal sequential decisions in complex multiagent systems, where agents might achieve higher utility via collaboration. The Microsoft Malmo Collaborative AI Challenge (MCAC), which is designed to encourage research relating to various problems in Collaborative AI, takes the form of a Minecraft mini-game where players might work together to catch a pig or deviate from cooperation, for pursuing high scores to win the challenge. Various characteristics, such as complex interactions among agents, uncertainties, sequential decision making and limited learning trials all make it extremely challenging to find effective strategies. We present HogRider---the champion agent of MCAC in 2017 out of 81 teams from 26 countries. One key innovation of HogRider is a generalized agent type hypothesis framework to identify the behavior model of the other agents, which is demonstrated to be robust to observation uncertainty. On top of that, a second key innovation is a novel Q-learning approach to learn effective policies against each type of the collaborating agents. Various ideas are proposed to adapt traditional Q-learning to handle complexities in the challenge, including state-action abstraction to reduce problem scale, a warm start approach using human reasoning for addressing limited learning trials, and an active greedy strategy to balance exploitation-exploration. Challenge results show that HogRider outperforms all the other teams by a significant edge, in terms of both optimality and stability.

关键词

multiagent learning; opponent modeling

全文

PDF


Effective Broad-Coverage Deep Parsing

摘要

Current semantic parsers either compute shallow representations over a wide range of input, or deeper representations in very limited domains. We describe a system that provides broad-coverage, deep semantic parsing designed to work in any domain using a core domain-general lexicon, ontology and grammar. This paper discusses how this core system can be customized for a particularly challenging domain, namely reading research papers in biology. We evaluate these customizations with some ablation experiments

关键词

Natural Language Understanding; Parsing; Deep Parsing; Semantic Parsing

全文

PDF


Faithful to the Original: Fact Aware Neural Abstractive Summarization

摘要

Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which inclines to create fake facts. Our preliminary study reveals nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on the improvement of informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parse technologies to extract actual fact descriptions from the source text. The dual-attention sequence-to-sequence framework is then proposed to force the generation conditioned on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can greatly reduce fake summaries by 80%. Notably, the fact descriptions also bring significant improvement on informativeness since they often condense the meaning of the source text.

关键词

summarization; faithfulness; fact

全文

PDF


Syntax-Directed Attention for Neural Machine Translation

摘要

Attention mechanism, including global attention and local attention, plays a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction. In comparison, local attention selectively looks at fixed-window source words. However, alignment weights for the current target word often decrease to the left and right by linear distance centering on the aligned source position and neglect syntax distance constraints. In this paper, we extend the local attention with syntax-distance constraint, which focuses on syntactically related source words with the predicted target word to learning a more effective context vector for predicting translation. Moreover, we further propose a double context NMT architecture, which consists of a global context vector and a syntax-directed context vector from the global attention, to provide more translation performance for NMT from source representation. The experiments on the large-scale Chinese-to-English and English-to-German translation tasks show that the proposed approach achieves a substantial and significant improvement over the baseline system.

关键词

Syntax Distance Constraint; Syntax-Directed Attention; Neural Machine Translation

全文

PDF


A Semantic QA-Based Approach for Text Summarization Evaluation

摘要

Many Natural Language Processing and Computational Linguistics applications involve the generation of new texts based on some existing texts, such as summarization, text simplification and machine translation. However, there has been a serious problem haunting these applications for decades, that is, how to automatically and accurately assess quality of these applications. In this paper, we will present some preliminary results on one especially useful and challenging problem in NLP system evaluation---how to pinpoint content differences of two text passages (especially for large passages such as articles and books). Our idea is intuitive and very different from existing approaches. We treat one text passage as a small knowledge base, and ask it a large number of questions to exhaustively identify all content points in it. By comparing the correctly answered questions from two text passages, we will be able to compare their content precisely. The experiment using 2007 DUC summarization corpus clearly shows promising results.

关键词

text summarization evaluation; question answering

全文

PDF


Learning Sentiment-Specific Word Embedding via Global Sentiment Representation

摘要

Context-based word embedding learning approaches can model rich semantic and syntactic information. However, it is problematic for sentiment analysis because the words with similar contexts but opposite sentiment polarities, such as good and bad, are mapped into close word vectors in the embedding space. Recently, some sentiment embedding learning methods have been proposed, but most of them are designed to work well on sentence-level texts. Directly applying those models to document-level texts often leads to unsatisfied results. To address this issue, we present a sentiment-specific word embedding learning architecture that utilizes local context informationas well as global sentiment representation. The architecture is applicable for both sentence-level and document-level texts. We take global sentiment representation as a simple average of word embeddings in the text, and use a corruption strategy as a sentiment-dependent regularization. Extensive experiments conducted on several benchmark datasets demonstrate that the proposed architecture outperforms the state-of-the-art methods for sentiment classification.

关键词

Sentiment word embedding

全文

PDF


Knowledge Graph Embedding With Iterative Guidance From Soft Rules

摘要

Embedding knowledge graphs (KGs) into continuous vector spaces is a focus of current research. Combining such an embedding model with logic rules has recently attracted increasing attention. Most previous attempts made a one-time injection of logic rules, ignoring the interactive nature between embedding learning and logical inference. And they focused only on hard rules, which always hold with no exception and usually require extensive manual effort to create or validate. In this paper, we propose Rule-Guided Embedding (RUGE), a novel paradigm of KG embedding with iterative guidance from soft rules. RUGE enables an embedding model to learn simultaneously from 1) labeled triples that have been directly observed in a given KG, 2) unlabeled triples whose labels are going to be predicted iteratively, and 3) soft rules with various confidence levels extracted automatically from the KG. In the learning process, RUGE iteratively queries rules to obtain soft labels for unlabeled triples, and integrates such newly labeled triples to update the embedding model. Through this iterative procedure, knowledge embodied in logic rules may be better transferred into the learned embeddings. We evaluate RUGE in link prediction on Freebase and YAGO. Experimental results show that: 1) with rule knowledge injected iteratively, RUGE achieves significant and consistent improvements over state-of-the-art baselines; and 2) despite their uncertainties, automatically extracted soft rules are highly beneficial to KG embedding, even those with moderate confidence levels. The code and data used for this paper can be obtained from https://github.com/iieir-km/RUGE.

关键词

全文

PDF


280 Birds With One Stone: Inducing Multilingual Taxonomies From Wikipedia Using Character-Level Classification

摘要

We propose a novel fully-automated approach towards inducing multilingual taxonomies from Wikipedia. Given an English taxonomy, our approach first leverages the interlanguage links of Wikipedia to automatically construct training datasets for the isa relation in the target language. Character-level classifiers are trained on the constructed datasets, and used in an optimal path discovery framework to induce high-precision, high-coverage taxonomies in other languages. Through experiments, we demonstrate that our approach significantly outperforms the state-of-the-art, heuristics-heavy approaches for six languages. As a consequence of our work, we release presumably the largest and the most accurate multilingual taxonomic resource spanning over 280 languages.

关键词

taxonomy induction; multilinguality; Wikipedia

全文

PDF


Neural Knowledge Acquisition via Mutual Attention Between Knowledge Graph and Text

摘要

We propose a general joint representation learning framework for knowledge acquisition (KA) on two tasks, knowledge graph completion (KGC) and relation extraction (RE) from text. In this framework, we learn representations of knowledge graphs (KGs) and text within a unified parameter sharing semantic space. To achieve better fusion, we propose an effective mutual attention between KGs and text. The reciprocal attention mechanism enables us to highlight important features and perform better KGC and RE. Different from conventional joint models, no complicated linguistic analysis or strict alignments between KGs and text are required to train our models. Experiments on relation extraction and entity link prediction show that models trained under our joint framework are significantly improved in comparison with other baselines. Most existing methods for KGC and RE can be easily integrated into our framework due to its flexible architectures. The source code of this paper can be obtained from https://github.com/thunlp/JointNRE.

关键词

Joint Learning; Mutual Attention; Knowledge Graph Completion; Relation Extraction

全文

PDF


FEEL: Featured Event Embedding Learning

摘要

Statistical script learning is an effective way to acquire world knowledge which can be used for commonsense reasoning. Statistical script learning induces this knowledge by observing event sequences generated from texts. The learned model thus can predict subsequent events, given earlier events. Recent approaches rely on learning event embeddings which capture script knowledge. In this work, we suggest a general learning model–Featured Event Embedding Learning (FEEL)–for injecting event embeddings with fine grained information. In addition to capturing the dependencies between subsequent events, our model can take into account higher level abstractions of the input event which help the model generalize better and account for the global context in which the event appears. We evaluated our model over three narrative cloze tasks, and showed that our model is competitive with the most recent state-of-the-art. We also show that our resulting embedding can be used as a strong representation for advanced semantic tasks such as discourse parsing and sentence semantic relatedness.

关键词

Natural Language Processing; Event Embeddings; Common Sense Inference; Statistical Script Learning; Representation Learning

全文

PDF


Linguistic Properties Matter for Implicit Discourse Relation Recognition: Combining Semantic Interaction, Topic Continuity and Attribution

摘要

Modern solutions for implicit discourse relation recognition largely build universal models to classify all of the different types of discourse relations. In contrast to such learning models, we build our model from first principles, analyzing the linguistic properties of the individual top-level Penn Discourse Treebank (PDTB) styled implicit discourse relations: Comparison, Contingency and Expansion. We find semantic characteristics of each relation type and two cohesion devices---topic continuity and attribution---work together to contribute such linguistic properties. We encode those properties as complex features and feed them into a NaiveBayes classifier, bettering baselines(including deep neural network ones) to achieve a new state-of-the-art performance level. Over a strong, feature-based baseline, our system outperforms one-versus-other binary classification by 4.83% for Comparison relation, 3.94% for Contingency and 2.22% for four-way classification.

关键词

natural language processing; linguistics; discourse relation; feature-based model

全文

PDF


Actionable Email Intent Modeling With Reparametrized RNNs

摘要

Emails in the workplace are often intentional calls to action for its recipients. We propose to annotate these emails for what action its recipient will take. We argue that our approach of action-based annotation is more scalable and theory-agnostic than traditional speech-act-based email intent annotation, while still carrying important semantic and pragmatic information. We show that our action-based annotation scheme achieves good inter-annotator agreement. We also show that we can leverage threaded messages from other domains, which exhibit comparable intents in their conversation, with domain adaptive RAINBOW (Recurrently AttentIve Neural Bag-Of-Words). On a collection of datasets consisting of IRC, Reddit, and email, our reparametrized RNNs outperform common multitask/multidomain approaches on several speech act related tasks. We also experiment with a minimally supervised scenario of email recipient action classification, and find the reparametrized RNNs learn a useful representation.

关键词

domain adaption email multitask multidomain

全文

PDF


Event Detection via Gated Multilingual Attention Mechanism

摘要

Identifying event instance in text plays a critical role in building NLP applications such as Information Extraction (IE) system. However, most existing methods for this task focus only on monolingual clues of a specific language and ignore the massive information provided by other languages. Data scarcity and monolingual ambiguity hinder the performance of these monolingual approaches. In this paper, we propose a novel multilingual approach---dubbed as Gated Multilingual Attention (GMLATT) framework---to address the two issues simultaneously. In specific, to alleviate data scarcity problem, we exploit the consistent information in multilingual data via context attention mechanism. Which takes advantage of the consistent evidence in multilingual data other than learning only from monolingual data. To deal with monolingual ambiguity problem, we propose gated cross-lingual attention to exploit the complement information conveyed by multilingual data, which is helpful for the disambiguation. The cross-lingual attention gate serves as a sentinel modelling the confidence of the clues provided by other languages and controls the information integration of various languages. We have conducted extensive experiments on the ACE 2005 benchmark. Experimental results show that our approach significantly outperforms state-of-the-art methods.

关键词

event detection; attention mechanism; deep learning

全文

PDF


Improving Sequence-to-Sequence Constituency Parsing

摘要

Sequence-to-sequence constituency parsing casts the tree structured prediction problem as a general sequential problem by top-down tree linearization,and thus it is very easy to train in parallel with distributed facilities. Despite its success, it relies on a probabilistic attention mechanism for a general purpose, which can not guarantee the selected context to be informative in the specific parsing scenario. Previous work introduced a deterministic attention to select the informative context for sequence-to-sequence parsing, but it is based on the bottom-up linearization even if it was observed that top-down linearization is better than bottom-up linearization for standard sequence-to-sequence constituency parsing. In this paper, we thereby extend the deterministic attention to directly conduct on the top-down tree linearization. Intensive experiments show that our parser delivers substantial improvements over the bottom-up linearization in accuracy, and it achieves 92.3 Fscore on the Penn English Treebank section 23 and 85.4 Fscore on the Penn Chinese Treebank test dataset, without reranking or semi-supervised training.

关键词

全文

PDF


Table-to-Text Generation by Structure-Aware Seq2seq Learning

摘要

Table-to-text generation aims to generate a description for a factual table which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq architecture which consists of field-gating encoder and description generator with dual attention. In the encoding phase, we update the cell memory of the LSTM unit by a field gate and its corresponding field value in order to incorporate field information into table representation. In the decoding phase, dual attention mechanism which contains word level attention and field level attention is proposed to model the semantic relevance between the generated description and the table. We conduct experiments on the WIKIBIO dataset which contains over 700k biographies and corresponding infoboxes from Wikipedia. The attention visualizations and case studies show that our model is capable of generating coherent and informative descriptions based on the comprehensive understanding of both the content and the structure of a table. Automatic evaluations also show our model outperforms the baselines by a great margin. Code for this work is available on https://github.com/tyliupku/wiki2bio.

关键词

Table-to-text; Structure-aware Seq2seq

全文

PDF


Never Retreat, Never Retract: Argumentation Analysis for Political Speeches

摘要

In this work, we apply argumentation mining techniques, in particular relation prediction, to study political speeches in monological form, where there is no direct interaction between opponents. We argue that this kind of technique can effectively support researchers in history, social and political sciences, which must deal with an increasing amount of data in digital form and need ways to automatically extract and analyse argumentation patterns. We test and discuss our approach based on the analysis of documents issued by R. Nixon and J. F. Kennedy during 1960 presidential campaign. We rely on a supervised classifier to predict argument relations (i.e., support and attack), obtaining an accuracy of 0.72 on a dataset of 1,462 argument pairs. The application of argument mining to such data allows not only to highlight the main points of agreement and disagreement between the candidates' arguments over the campaign issues such as Cuba, disarmament and health-care, but also an in-depth argumentative analysis of the respective viewpoints on these topics.

关键词

argument mining; political speeches; support and attack classification

全文

PDF


AMR Parsing With Cache Transition Systems

摘要

In this paper, we present a transition system that generalizes transition-based dependency parsing techniques to generateAMR graphs rather than tree structures. In addition to a buffer and a stack, we use a fixed-size cache, and allow the system to build arcs to any vertices present in the cache at the same time. The size of the cache provides a parameter that can trade off between the complexity of the graphs that can be built and the ease of predicting actions during parsing. Our results show that a cache transition system can cover almost all AMR graphs with a small cache size, and our end-to-end system achieves competitive results in comparison with other transition-based approaches for AMR parsing.

关键词

AMR parsing; cache transition system; semantic parsing

全文

PDF


Early Syntactic Bootstrapping in an Incremental Memory-Limited Word Learner

摘要

It has been suggested that early human word learning occurs across learning situations and is bootstrapped by syntactic regularities such as word order. Simulation results from ideal learners and models assuming prior access to structured syn-tactic and semantic representations suggest that it is possible to jointly acquire word order and meanings and that learning is improved as each language capability bootstraps the other.We first present a probabilistic framework for early syntactic bootstrapping in the absence of advanced structured representations, then we use our framework to study the utility of joint acquisition of word order and word referent and its onset, in a memory-limited incremental model. Comparing learning results in the presence and absence of joint acquisition of word order in different ambiguous contexts, improvement in word order results showed an immediate onset, starting in early trials while being affected by context ambiguity. Improvement in word learning results on the other hand, was hindered in early trials where the acquired word order was imperfect,while being facilitated by word order learning in future trials as the acquired word order improved. Furthermore, our results showed that joint acquisition of word order and word referent facilitates one-shot learning of new words as well as inferring intentions of the speaker in ambiguous contexts.

关键词

Cross-Situational Word Learning, Incremental Learning Algorithms, Memory-Limited, Syntactic Bootstrapping, Infant Word Learning, Online Learning, Joint Acquisition, Word Order Learning

全文

PDF


Recognizing and Justifying Text Entailment Through Distributional Navigation on Definition Graphs

摘要

Text entailment, the task of determining whether a piece of text logically follows from another piece of text, has become an important component for many natural language processing tasks, such as question answering and information retrieval. For entailments requiring world knowledge, most systems still work as a "black box," providing a yes/no answer that doesn't explain the reasoning behind it. We propose an interpretable text entailment approach that, given a structured definition graph, uses a navigation algorithm based on distributional semantic models to find a path in the graph which links text and hypothesis. If such path is found, it is used to provide a human-readable justification explaining why the entailment holds. Experiments show that the proposed approach present results comparable to some well-established entailment algorithms, while also meeting Explainable AI requirements, supplying clear explanations which allow the inference model interpretation.

关键词

Text Entailment; Definition Graph; Distributional Navigation; Interpretability

全文

PDF


SPINE: SParse Interpretable Neural Embeddings

摘要

Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity and latent trends in the data, they are far from being interpretable. We propose a novel variant of denoising k-sparse autoencoders that generates highly efficient and interpretable distributed word representations (word embeddings), beginning with existing word representations from state-of-the-art methods like GloVe and word2vec. Through large scale human evaluation, we report that our resulting word embedddings are much more interpretable than the original GloVe and word2vec embeddings. Moreover, our embeddings outperform existing popular word embeddings on a diverse suite of benchmark downstream tasks.

关键词

interpretability; representation learning; word embeddings; autoencoder

全文

PDF


Deep Semantic Role Labeling With Self-Attention

摘要

Semantic Role Labeling (SRL) is believed to be a crucial step towards natural language understanding and has been widely studied. Recent years, end-to-end SRL with recurrent neural networks (RNN) has gained increasing attention. However, it remains a major challenge for RNNs to handle structural information and long range dependencies. In this paper, we present a simple and effective architecture for SRL which aims to address these problems. Our model is based on self-attention which can directly capture the relationships between two tokens regardless of their distance. Our single model achieves F1=83.4 on the CoNLL-2005 shared task dataset and F1=82.7 on the CoNLL-2012 shared task dataset, which outperforms the previous state-of-the-art results by 1.8 and 1.0 F1 score respectively. Besides, our model is computationally efficient, and the parsing speed is 50K tokens per second on a single Titan X GPU.

关键词

全文

PDF


Translating Pro-Drop Languages With Reconstruction Models

摘要

Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations. To date, very little attention has been paid to the dropped pronoun (DP) problem within neural machine translation (NMT). In this work, we propose a novel reconstruction-based approach to alleviating DP translation problems for NMT models. Firstly, DPs within all source sentences are automatically annotated with parallel information extracted from the bilingual training corpus. Next, the annotated source sentence is reconstructed from hidden representations in the NMT model. With auxiliary training objectives, in the terms of reconstruction scores, the parameters associated with the NMT model are guided to produce enhanced hidden representations that are encouraged as much as possible to embed annotated DP information. Experimental results on both Chinese-English and Japanese-English dialogue translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline, which is directly built on the training data annotated with DPs.

关键词

Neural Machine Translation; Pro-Drop Language; Dropped Pronoun; Reconstruction Model; Dialogue

全文

PDF


Event Representations With Tensor-Based Compositions

摘要

Robust and flexible event representations are important to many core areas in language understanding. Scripts were proposed early on as a way of representing sequences of events for such understanding, and has recently attracted renewed attention. However, obtaining effective representations for modeling script-like event sequences is challenging. It requires representations that can capture event-level and scenario-level semantics. We propose a new tensor-based composition method for creating event representations. The method captures more subtle semantic interactions between an event and its entities and yields representations that are effective at multiple event-related tasks. With the continuous representations, we also devise a simple schema generation method which produces better schemas compared to a prior discrete representation based method. Our analysis shows that the tensors capture distinct usages of a predicate even when there are only subtle differences in their surface realizations.

关键词

Event; Script; Schema

全文

PDF


Does William Shakespeare REALLY Write Hamlet? Knowledge Representation Learning With Confidence

摘要

Knowledge graphs (KGs), which could provide essential relational information between entities, have been widely utilized in various knowledge-driven applications. Since the overall human knowledge is innumerable that still grows explosively and changes frequently, knowledge construction and update inevitably involve automatic mechanisms with less human supervision, which usually bring in plenty of noises and conflicts to KGs. However, most conventional knowledge representation learning methods assume that all triple facts in existing KGs share the same significance without any noises. To address this problem, we propose a novel confidence-aware knowledge representation learning framework (CKRL), which detects possible noises in KGs while learning knowledge representations with confidence simultaneously. Specifically, we introduce the triple confidence to conventional translation-based methods for knowledge representation learning. To make triple confidence more flexible and universal, we only utilize the internal structural information in KGs, and propose three kinds of triple confidences considering both local and global structural information. In experiments, We evaluate our models on knowledge graph noise detection, knowledge graph completion and triple classification. Experimental results demonstrate that our confidence-aware models achieve significant and consistent improvements on all tasks, which confirms the capability of CKRL modeling confidence with structural information in both KG noise detection and knowledge representation learning.

关键词

knowledge graph; confidence; knowledge representation learning; representation learning

全文

PDF


Multi-Channel Encoder for Neural Machine Translation

摘要

Attention-based Encoder-Decoder has the effective architecture for neural machine translation (NMT), which typically relies on recurrent neural networks (RNN) to build the blocks that will be lately called by attentive reader during the decoding process. This design of encoder yields relatively uniform composition on source sentence, despite the gating mechanism employed in encoding RNN. On the other hand, we often hope the decoder to take pieces of source sentence at varying levels suiting its own linguistic structure: for example, we may want to take the entity name in its raw form while taking an idiom as a perfectly composed unit. Motivated by this demand, we propose Multi-channel Encoder (MCE), which enhances encoding components with different levels of composition. More specifically, in addition to the hidden state of encoding RNN, MCE takes 1) the original word embedding for raw encoding with no composition, and 2) a particular design of external memory in Neural Turing Machine NTM) for more complex composition, while all three encoding strategies are properly blended during decoding. Empirical study on Chinese-English translation shows that our model can improve by 6.52 BLEU points upon a strong open source NMT system: DL4MT1. On the WMT14 English-French task, our single shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep models.

关键词

NMT

全文

PDF


Augmenting End-to-End Dialogue Systems With Commonsense Knowledge

摘要

Building dialogue systems that can converse naturally with humans is a challenging yet intriguing problem of artificial intelligence. In open-domain human-computer conversation, where the conversational agent is expected to respond to human utterances in an interesting and engaging way, commonsense knowledge has to be integrated into the model effectively. In this paper, we investigate the impact of providing commonsense knowledge about the concepts covered in the dialogue. Our model represents the first attempt to integrating a large commonsense knowledge base into end-to-end conversational models. In the retrieval-based scenario, we propose a model to jointly take into account message content and related commonsense for selecting an appropriate response. Our experiments suggest that the knowledge-augmented models are superior to their knowledge-free counterparts.

关键词

Commonsense Knowledge; Dialogue Systems

全文

PDF


An Unsupervised Model With Attention Autoencoders for Question Retrieval

摘要

Question retrieval is a crucial subtask for community question answering. Previous research focus on supervised models which depend heavily on training data and manual feature engineering. In this paper, we propose a novel unsupervised framework, namely reduced attentive matching network (RAMN), to compute semantic matching between two questions. Our RAMN integrates together the deep semantic representations, the shallow lexical mismatching information and the initial rank produced by an external search engine. For the first time, we propose attention autoencoders to generate semantic representations of questions. In addition, we employ lexical mismatching to capture surface matching between two questions, which is derived from the importance of each word in a question. We conduct experiments on the open CQA datasets of SemEval-2016 and SemEval-2017. The experimental results show that our unsupervised model obtains comparable performance with the state-of-the-art supervised methods in SemEval-2016 Task 3, and outperforms the best system in SemEval-2017 Task 3 by a wide margin.

关键词

community question answering; question retrieval; attention autoencoders

全文

PDF


Sequential Copying Networks

摘要

Copying mechanism shows effectiveness in sequence-to-sequence based neural network models for text generation tasks, such as abstractive sentence summarization and question generation. However, existing works on modeling copying or pointing mechanism only considers single word copying from the source sentences. In this paper, we propose a novel copying framework, named Sequential Copying Networks (SeqCopyNet), which not only learns to copy single words, but also copies sequences from the input sentence. It leverages the pointer networks to explicitly select a sub-span from the source side to target side, and integrates this sequential copying mechanism to the generation process in the encoder-decoder paradigm. Experiments on abstractive sentence summarization and question generation tasks show that the proposed SeqCopyNet can copy meaningful spans and outperforms the baseline models.

关键词

Summarization; Question Generation

全文

PDF


Lattice Recurrent Unit: Improving Convergence and Statistical Efficiency for Sequence Modeling

摘要

Recurrent neural networks have shown remarkable success in modeling sequences. However low resource situations still adversely affect the generalizability of these models. We introduce a new family of models, called Lattice Recurrent Units (LRU), to address the challenge of learning deep multi-layer recurrent models with limited resources.  LRU models achieve this goal by creating distinct (but coupled) flow of information inside the units: a first flow along time dimension and a second flow along depth dimension. It also offers a symmetry in how information can flow horizontally and vertically.  We analyze the effects of decoupling three different components of our LRU model: Reset Gate, Update Gate and Projected State. We evaluate this family of new LRU models on computational convergence rates and statistical efficiency.Our experiments are performed on four publicly-available datasets, comparing with Grid-LSTM and Recurrent Highway networks. Our results show that LRU has better empirical computational convergence rates and statistical efficiency values, along with learning more accurate language models.

关键词

recurrent unit; sequence modeling; temporal; language model; lattice

全文

PDF


Leveraging Lexical Substitutes for Unsupervised Word Sense Induction

摘要

Word sense induction is the most prominent unsupervised approach to lexical disambiguation. It clusters word instances, typically represented by their bag-of-words contexts. Therefore, uninformative and ambiguous contexts present a major challenge. In this paper, we investigate the use of an alternative instance representation based on lexical substitutes, i.e., contextually suitable, meaning-preserving replacements. Using lexical substitutes predicted by a state-of-the-art automatic system and a simple clustering algorithm, we outperform bag-of-words instance representations and compete with much more complex structured probabilistic models. Furthermore, we show that an oracle based on manually-labeled lexical substitutes yields yet substantially higher performance. Taken together, this provides evidence for a complementarity between word sense induction and lexical substitution that has not been given much consideration before.

关键词

word sense induction; lexical substitution

全文

PDF


Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations

摘要

Using a dictionary to map independently trained word embeddings to a shared space has shown to be an effective approach to learn bilingual word embeddings. In this work, we propose a multi-step framework of linear transformations that generalizes a substantial body of previous work. The core step of the framework is an orthogonal transformation, and existing methods can be explained in terms of the additional normalization, whitening, re-weighting, de-whitening and dimensionality reduction steps. This allows us to gain new insights into the behavior of existing methods, including the effectiveness of inverse regression, and design a novel variant that obtains the best published results in zero-shot bilingual lexicon extraction. The corresponding software is released as an open source project.

关键词

cross-lingual word embeddings; bilingual word embedding mappings; bilingual lexicon extraction

全文

PDF


Table-to-Text: Describing Table Region With Natural Language

摘要

In this paper, we present a generative model to generate a natural language sentence describing a table region, e.g., a row. The model maps a row from a table to a continuous vector and then generates a natural language sentence by leveraging the semantics of a table. To deal with rare words appearing in a table, we develop a flexible copying mechanism that selectively replicates contents from the table in the output sequence. Extensive experiments demonstrate the accuracy of the model and the power of the copying mechanism. On two synthetic datasets, WIKIBIO and SIMPLEQUESTIONS, our model improves the current state-of-the-art BLEU-4 score from 34.70 to 40.26 and from 33.32 to 39.12, respectively. Furthermore, we introduce an open-domain dataset WIKITABLETEXT including 13,318 explanatory sentences for 4,962 tables. Our model achieves a BLEU-4 score of 38.23, which outperforms template based and language model based approaches.

关键词

Table-to-Text Generation; Neural Question Generation; Table-to-Sequence

全文

PDF


Learning Interpretable Spatial Operations in a Rich 3D Blocks World

摘要

In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world. We first introduce a new dataset that pairs complex 3D spatial operations to rich natural language descriptions that require complex spatial and pragmatic interpretations such as “mirroring”, “twisting”, and “balancing”. This dataset, built on the simulation environment of Bisk, Yuret, and Marcu (2016), attains language that is significantly richer and more complex, while also doubling the size of the original dataset in the 2D environment with 100 new world configurations and 250,000 tokens. In addition, we propose a new neural architecture that achieves competitive results while automatically discovering an inventory of interpretable spatial operations (Figure 5).  

关键词

grounding; natural language; spatial; actions

全文

PDF


Using k-Way Co-Occurrences for Learning Word Embeddings

摘要

Co-occurrences between two words provide useful insights into the semantics of those words.Consequently, numerous prior work on word embedding learning has used co-occurrences between two wordsas the training signal for learning word embeddings.However, in natural language texts it is common for multiple words to be related and co-occurring in the same context.We extend the notion of co-occurrences to cover k(≥2)-way co-occurrences among a set of k-words.Specifically, we prove a theoretical relationship between the joint probability of k(≥2) words, and the sum of l_2 norms of their embeddings. Next, we propose a learning objective motivated by our theoretical resultthat utilises k-way co-occurrences for learning word embeddings.Our experimental results show that the derived theoretical relationship does indeed hold empirically, anddespite data sparsity, for some smaller k(≤5) values, k-way embeddings perform comparably or better than 2-way embeddings in a range of tasks.

关键词

Word Embeddings; k-way co-occurrences

全文

PDF


Proposition Entailment in Educational Applications using Deep Neural Networks

摘要

The next generation of educational applications need to significantly improve the way feedback is offered to both teachers and students. Simply determining coarse-grained entailment relations between the teacher's reference answer as a whole and a student response will not be sufficient. A finer-grained analysis is needed to determine which aspects of the reference answer have been understood and which have not. To this end, we propose an approach that splits the reference answer into its constituent propositions and two methods for detecting entailment relations between each reference answer proposition and a student response. Both methods, one using hand-crafted features and an SVM and the other using word embeddings and deep neural networks, achieve significant improvements over a state-of-the-art system and two alternative approaches.

关键词

educational applications; deep neural networks; machine learning; word embeddings; entailment

全文

PDF


cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information

摘要

We propose cw2vec, a novel method for learning Chinese word embeddings. It is based on our observation that exploiting stroke-level information is crucial for improving the learning of Chinese word embeddings. Specifically, we design a minimalist approach to exploit such features, by using stroke n-grams, which capture semantic and morphological level information of Chinese words. Through qualitative analysis, we demonstrate that our model is able to extract semantic information that  cannot be captured by existing methods. Empirical results on the word similarity, word analogy, text classification and named entity recognition tasks show that the proposed approach consistently outperforms state-of-the-art approaches such as word-based word2vec and GloVe, character-based CWE, component-based JWE and pixel-based GWE.

关键词

全文

PDF


Knowledge-based Word Sense Disambiguation using Topic Models

摘要

Word Sense Disambiguation is an open problem in Natural Language Processing which is particularly challenging and useful in the unsupervised setting where all the words in any given text need to be disambiguated without using any labeled data. Typically WSD systems use the sentence or a small window of words around the target word as the context for disambiguation because their computational complexity scales exponentially with the size of the context. In this paper, we leverage the formalism of topic model to design a WSD system that scales linearly with the number of words in the context. As a result, our system is able to utilize the whole document as the context for a word to be disambiguated. The proposed method is a variant of Latent Dirichlet Allocation in which the topic proportions for a document are replaced by synset proportions. We further utilize the information in the WordNet by assigning a non-uniform prior to synset distribution over words and a logistic-normal prior for document distribution over synsets. We evaluate the proposed method on Senseval-2, Senseval-3, SemEval-2007, SemEval-2013 and SemEval-2015 English All-Word WSD datasets and show that it outperforms the state-of-the-art unsupervised knowledge-based WSD system by a significant margin.

关键词

WSD; Word Sense Disambiguation; Topic models; NLP

全文

PDF


Meta Multi-Task Learning for Sequence Modeling

摘要

Semantic composition functions have been playing a pivotal role in neural representation learning of text sequences. In spite of their success, most existing models suffer from the underfitting problem: they use the same shared compositional function on all the positions in the sequence, thereby lacking expressive power due to incapacity to capture the richness of compositionality. Besides, the composition functions of different tasks are independent and learned from scratch. In this paper, we propose a new sharing scheme of composition function across multiple tasks. Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and generate the parameters of the task-specific semantic composition models. We conduct extensive experiments on two types of tasks, text classification and sequence tagging, which demonstrate the benefits of our approach. Besides, we show that the shared meta-knowledge learned by our proposed model can be regarded as off-the-shelf knowledge and easily transferred to new tasks.

关键词

multi-task learning; nature language process; deep learning

全文

PDF


IMS-DTM: Incremental Multi-Scale Dynamic Topic Models

摘要

Dynamic topic models (DTM) are commonly used for mining latent topics in evolving web corpora. In this paper, we note that a major limitation of the conventional DTM based models is that they assume a predetermined and fixed scale of topics. In reality, however, topics may have varying spans and topics of multiple scales can co-exist in a single web or social media data stream. Therefore, DTMs that assume a fixed epoch length may not be able to effectively capture latent topics and thus negatively affect accuracy. In this paper, we propose a Multi-Scale Dynamic Topic Model (MS-DTM) and a complementary Incremental Multi-Scale Dynamic Topic Model (IMS-DTM) inference method that can be used to capture latent topics and their dynamics simultaneously, at different scales. In this model, topic specific feature distributions are generated based on a multi-scale feature distribution of the previous epochs; moreover, multiple scales of the current epoch are analyzed together through a novel multi-scale incremental Gibbs sampling technique. We show that the proposed model significantly improves efficiency and effectiveness compared to the single scale dynamic DTMs and prior models that consider only multiple scales of the past.

关键词

Incremental Analysis; Dynamic Topic Models; Gibbs Sampling; Multi-Scale

全文

PDF


Zero-Resource Neural Machine Translation with Multi-Agent Communication Game

摘要

While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively exposed to large amounts of parallel corpora, our learners (implemented as encoder-decoder architecture) engage in cooperative image description games, and thus develop their own image captioning or neural machine translation model from the need to communicate in order to succeed at the game. Experimental results on the IAPR-TC12 and Multi30K datasets show that the proposed learning mechanism significantly improves over the state-of-the-art methods.

关键词

NMT; zero-resource; multimodal

全文

PDF


Learning to Compose Task-Specific Tree Structures

摘要

For years, recursive neural networks (RvNNs) have been shown to be suitable for representing text into fixed-length vectors and achieved good performance on several natural language processing tasks. However, the main drawback of RvNNs is that they require structured input, which makes data preparation and model implementation hard. In this paper, we propose Gumbel Tree-LSTM, a novel tree-structured long short-term memory architecture that learns how to compose task-specific tree structures only from plain text data efficiently. Our model uses Straight-Through Gumbel-Softmax estimator to decide the parent node among candidates dynamically and to calculate gradients of the discrete decision. We evaluate the proposed model on natural language inference and sentiment analysis,  and show that our model outperforms or is at least comparable to previous models. We also find that our model converges significantly faster than other models.

关键词

recursive neural network; tree-lstm; gumbel-softmax; unsupervised structure learning

全文

PDF


Geometric Relationship between Word and Context Representations

摘要

Pre-trained distributed word representations have been proven to be useful in various natural language processing (NLP) tasks. However, the geometric basis of word representations and their relations to the representations of word's contexts has not been carefully studied yet. In this study, we first investigate such geometric relationship under a general framework, which is abstracted from some typical word representation learning approaches, and find out that only the directions of word representations are well associated to their context vector representations while the magnitudes are not. In order to make better use of the information contained in the magnitudes of word representations, we propose a hierarchical Gaussian model combined with maximum a posteriori estimation to learn word representations, and extend it to represent polysemous words. Our word representations have been evaluated on multiple NLP tasks, and the experimental results show that the proposed model achieved promising results, comparing to several popular word representations.

关键词

全文

PDF


A Knowledge-Grounded Neural Conversation Model

摘要

Neural network models are capable of generating extremely natural sounding conversational interactions.  However, these models have been mostly applied to casual scenarios (e.g., as “chatbots”) and have yet to demonstrate they can serve in more useful conversational applications. This paper presents a novel, fully data-driven, and knowledge-grounded neural conversation model aimed at producing more contentful responses.  We generalize the widely-used Sequence-to-Sequence (Seq2Seq) approach by conditioning responses on both conversation history and external “facts”, allowing the model to be versatile and applicable in an open-domain setting.  Our approach yields significant improvements over a competitive Seq2Seq baseline. Human judges found that our outputs are significantly more informative.

关键词

NLP; dialogue; conversation models; generation; deep learning

全文

PDF


Learning to Predict Readability Using Eye-Movement Data From Natives and Learners

摘要

Readability assessment can improve the quality of assisting technologies aimed at language learners. Eye-tracking data has been used for both inducing and evaluating general-purpose NLP/AI models, and below we show that unsurprisingly, gaze data from language learners can also improve multi-task readability assessment models. This is unsurprising, since the gaze data records the reading difficulties ofthe learners. Unfortunately, eye-tracking data from language learners is often much harder to obtain than eye-tracking data from native speakers. We therefore compare the performance of deep learning readability models that use nativespeaker eye movement data to models using data from language learners. Somewhat surprisingly, we observe no significant drop in performance when replacing learners with natives, making approaches that rely on native speaker gaze information, more scalable. In other words, our finding is that language learner difficulties can be efficiently estimated from native speakers, which suggests that, more generally, readily available gaze data can be used to improve educational NLP/AI models targeted towards language learners.

关键词

readability, eye movements, NLP, Machine Learning

全文

PDF


Neural Machine Translation with Gumbel-Greedy Decoding

摘要

Previous neural machine translation models used some heuristic search algorithms (e.g., beam search) in order to avoid solving the maximum a posteriori problem over translation sentences at test phase. In this paper, we propose the \textit{Gumbel-Greedy Decoding} which trains a generative network to predict translation under a trained model. We solve such a problem using the Gumbel-Softmax reparameterization, which makes our generative network differentiable and trainable through standard stochastic gradient methods. We empirically demonstrate that our proposed model is effective for generating sequences of discrete words.

关键词

Machine Translation;Gumbel Softmax;Greedy Decoding;Generator Discriminator

全文

PDF


Search Engine Guided Neural Machine Translation

摘要

In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training. The proposed approach consists of two stages. In the first stage –retrieval stage–, an off-the-shelf, black-box search engine is used to retrieve a small subset of sentence pairs from a training set given a source sentence. These pairs are further filtered based on a fuzzy matching score based on edit distance. In the second stage–translation stage–, a novel translation model, called search engine guided NMT (SEG-NMT), seamlessly uses both the source sentence and a set of retrieved sentence pairs to perform the translation. Empirical evaluation on three language pairs (En-Fr, En-De, and En-Es) shows that the proposed approach significantly outperforms the baseline approach and the improvement is more significant when more relevant sentence pairs were retrieved.

关键词

Machine Translation;Search Engine; Non-Parametric; Translation Memory

全文

PDF


Long Text Generation via Adversarial Training with Leaked Information

摘要

Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets(GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy has shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. As such, it limits its success when the length of the generated text samples is long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional MANAGER module, which takes the extracted features of current generated words and outputs a latent vector to guide the WORKER module for next-word generation.Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between MANAGER and WORKER.

关键词

Text Generation;GAN;RL

全文

PDF


A Deep Generative Framework for Paraphrase Generation

摘要

Paraphrase generation is an important problem in NLP, especially in question answering, information retrieval, information extraction, conversation systems, to name a few. In this paper, we address the problem of generating paraphrases automatically. Our proposed method is based on a combination of deep generative models (VAE) with sequence-to-sequence models (LSTM) to generate paraphrases, given an input sentence. Traditional VAEs when combined with recurrent neural networks can generate free text but they are not suitable for paraphrase generation for a given sentence. We address this problem by conditioning the both, encoder and decoder sides of VAE, on the original sentence, so that it can generate the given sentence's paraphrases. Unlike most existing models, our model is simple, modular and can generate multiple paraphrases, for a given sentence. Quantitative evaluation of the proposed method on a benchmark paraphrase dataset demonstrates its efficacy, and its performance improvement over the state-of-the-art methods by a significant margin, whereas qualitative human evaluation indicate that the generated paraphrases are well-formed, grammatically correct, and are relevant to the input sentence. Furthermore, we evaluate our method on a newly released question paraphrase dataset, and establish a new baseline for future research.

关键词

Paraphrase generation, variational autoencoders, question paraphrase

全文

PDF


Placing Objects in Gesture Space: Toward Incremental Interpretation of Multimodal Spatial Descriptions

摘要

When describing routes not in the current environment, a common strategy is to anchor the description in configurations of salient landmarks, complementing the verbal descriptions by "placing" the non-visible landmarks in the gesture space.  Understanding such multimodal descriptions and later locating the landmarks from real world is a challenging task for the hearer, who must interpret speech and gestures in parallel, fuse information from both modalities, build a mental representation of the description, and ground the knowledge to real world landmarks.  In this paper, we model the hearer's task, using a multimodal spatial description corpus we collected.  To reduce the variability of verbal descriptions, we simplified the setup to use simple objects as landmarks.  We describe a real-time system to  evaluate the separate and joint contribution of the modalities. We show that gestures not only help to improve the overall system performance, even if to a large extent they encode redundant information, but also result in earlier final correct interpretations. Being able to build and apply representations incrementally will be of use in more dialogical settings, we argue, where it can enable immediate clarification in cases of mismatch.

关键词

Co-verbal gestures; abstract deixis; real time system; incremental processing; multimodal interface

全文

PDF


Jointly Parse and Fragment Ungrammatical Sentences

摘要

This paper is about detecting incorrect arcs in a dependency parse for sentences that contain grammar mistakes. Pruning these arcs results in well-formed parse fragments that can still be useful for downstream applications. We propose two automatic methods that jointly parse the ungrammatical sentence and prune the incorrect arcs: a parser retrained on a parallel corpus of ungrammatical sentences with their corrections, and a sequence-to-sequence method. Experimental results show that the proposed strategies are promising for detecting incorrect syntactic dependencies as well as incorrect semantic dependencies.

关键词

Parse Tree Fragmentation;Dependency Arc Pruning;Ungrammatical Sentences;Semantic Role Labeling

全文

PDF


Persuasive Influence Detection: The Role of Argument Sequencing

摘要

Automatic detection of persuasion in online discussion is key to understanding how social media is used. Predicting persuasiveness is difficult, however, due to the need to model world knowledge, dialogue, and sequential reasoning. We focus on modeling the sequence of arguments in social media posts using neural models with embeddings for words, discourse relations, and semantic frames. We demonstrate significant improvement over prior work in detecting successful arguments. We also present an error analysis assessing novice human performance at predicting persuasiveness.

关键词

全文

PDF


An Interpretable Generative Adversarial Approach to Classification of Latent Entity Relations in Unstructured Sentences

摘要

We propose a generative adversarial neural network model for relation classification that attempts to emulate the way in which human analysts might process sentences. Our approach provides two unique benefits over existing capabilities: (1) we make predictions by finding and exploiting supportive rationales to improve interpretability (i.e. words or phrases extracted from a sentence that a person can reason upon), and (2) we allow predictions to be easily corrected by adjusting the rationales.Our model consists of three stages: Generator, Selector, and Encoder. The Generator identifies candidate text fragments; the Selector decides which fragments can be used as rationales depending on the goal; and finally, the Encoder performs relation reasoning on the rationales. While the Encoder is trained in a supervised manner to classify relations, the Generator and Selector are designed as unsupervised models to identify rationales without prior knowledge, although they can be semi-supervised through human annotations. We evaluate our model on data from SemEval 2010 that provides 19 relation-classes. Experiments demonstrate that our approach outperforms state-of-the-art models, and that our model is capable of extracting good rationales on its own as well as benefiting from labeled rationales if provided.

关键词

Relation Classification

全文

PDF


SciTaiL: A Textual Entailment Dataset from Science Question Answering

摘要

We present a new dataset and model for textual entailment, derived from treating multiple-choice question-answering as an entailment problem. SciTail is the first entailment set that is created solely from natural sentences that already exist independently ``in the wild'' rather than sentences authored specifically for the entailment task. Different from existing entailment datasets, we create hypotheses from science questions and the corresponding answer candidates, and premises from relevant web sentences retrieved from a large corpus. These sentences are often linguistically challenging. This, combined with the high lexical similarity of premise and hypothesis for both entailed and non-entailed pairs, makes this new entailment task particularly difficult. The resulting challenge is evidenced by state-of-the-art textual entailment systems achieving mediocre performance on SciTail, especially in comparison to a simple majority class baseline. As a step forward, we demonstrate that one can improve accuracy on SciTail by 5% using a new neural model that exploits linguistic structure.

关键词

textual entailment; dataset; neural networks, structured entailment; science question answering

全文

PDF


Efficient Large-Scale Multi-Modal Classification

摘要

While the incipient internet was largely text-based, the modern digital world is becoming increasingly multi-modal. Here, we examine multi-modal classification where one modality is discrete, e.g. text, and the other is continuous, e.g. visual representations transferred from a convolutional neural network. In particular, we focus on scenarios where we have to be able to classify large quantities of data quickly. We investigate various methods for performing multi-modal fusion and analyze their trade-offs in terms of classification accuracy and computational efficiency. Our findings indicate that the inclusion of continuous information improves performance over text-only on a range of multi-modal classification tasks, even with simple fusion methods. In addition, we experiment with discretizing the continuous features in order to speed up and simplify the fusion process even further. Our results show that fusion with discretized features outperforms text-only classification, at a fraction of the computational cost of full multi-modal fusion, with the additional benefit of improved interpretability.

关键词

全文

PDF


Neural Character-level Dependency Parsing for Chinese

摘要

This paper presents a truly full character-level neural dependency parser together with a newly released character-level dependency treebank for Chinese, which has suffered a lot from the dilemma of defining word or not to model character interactions. Integrating full character-level dependencies with character embedding and human annotated character-level part-of-speech and dependency labels for the first time, we show an extra performance enhancement from the evaluation on Chinese Penn Treebank and SJTU (Shanghai Jiao Tong University) Chinese Character Dependency Treebank and the potential of better understanding deeper structure of Chinese sentences.

关键词

parsing; neural network; character based

全文

PDF


Conversational Model Adaptation via KL Divergence Regularization

摘要

In this study we formulate the problem of conversational model adaptation, where we aim to build a generative conversational model for a target domain based on a limited amount of dialogue data from this target domain and some existing dialogue models from related source domains. This model facilitates the fast building of a chatbot platform, where a new vertical chatbot with only a small number of conversation data can be supported by other related mature chatbots. Previous studies on model adaptation and transfer learning mostly focus on classification and recommendation problems, however, how these models work for conversation generation are still unexplored. To this end, we leverage a KL divergence (KLD) regularization to adapt the existing conversational models. Specifically, it employs the KLD to measure the distance between source and target domain. Adding KLD as a regularization to the objective function allows the proposed method to utilize the information from source domains effectively. We also evaluate the performance of this adaptation model for the online chatbots in Wechat platform of public accounts using both the BLEU metric and human judgement. The experiments empirically show that the proposed method visibly improves these evaluation metrics.

关键词

全文

PDF


Slim Embedding Layers for Recurrent Neural Language Models

摘要

Recurrent neural language models are the state-of-the-art models for language modeling. When the vocabulary size is large, the space taken to store the model parameters becomes the bottleneck for the use of recurrent neural language models. In this paper, we introduce a simple space compression method that randomly shares the structured parameters at both the input and output embedding layers of the recurrent neural language models to significantly reduce the size of model parameters, but still compactly represent the original input and output embedding layers. The method is easy to implement and tune. Experiments on several data sets showthat the new method can get similar perplexity and BLEU score results whileonly using a very tiny fraction of parameters.

关键词

Language Modeling; Embedding Layers

全文

PDF


Automatic Generation of Text Descriptive Comments for Code Blocks

摘要

We propose a framework to automatically generate descriptive comments for source code blocks. While this problem has been studied by many researchers previously, their methods are mostly based on fixed template and achieves poor results. Our framework does not rely on any template, but makes use of a new recursive neural network called CodeRNN to extract features from the source code and embed them into one vector. When this vector representation is input to a new recurrent neural network (Code-GRU), the overall framework generates text descriptions of the code with accuracy (Rouge-2 value) significantly higher than other learning-based approaches such as sequence-to-sequence model. The Code-RNN model can also be used in other scenario where the representation of code is required.

关键词

code comment; recursive neural network; recurrent neural network

全文

PDF


BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

摘要

We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems. Our agents explore via Thompson sampling, drawing Monte Carlo samples from a Bayes-by-Backprop neural network. Our algorithm learns much faster than common exploration strategies such as ε-greedy, Boltzmann, bootstrapping, and intrinsic-reward-based ones. Additionally, we show that spiking the replay buffer with experiences from just a few successful episodes can make Q-learning feasible when it might otherwise fail.

关键词

task-oriented dialogue;deep reinforcement learning;exploration;policy learning

全文

PDF


Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models

摘要

Dialog response selection is an important step towards natural response generation in conversational agents. Existing work on neural conversational models mainly focuses on offline supervised learning using a large set of context-response pairs. In this paper, we focus on online learning of response selection in retrieval-based dialog systems. We propose a contextual multi-armed bandit model with a nonlinear reward function that uses distributed representation of text for online response selection. A bidirectional LSTM is used to produce the distributed representations of dialog context and responses, which serve as the input to a contextual bandit. In learning the bandit, we propose a customized Thompson sampling method that is applied to a polynomial feature space in approximating the reward. Experimental results on the Ubuntu Dialogue Corpus demonstrate significant performance gains of the proposed method over conventional linear contextual bandits. Moreover, we report encouraging response selection performance of the proposed neural bandit model using the Recall@k metric for a small set of online training samples.

关键词

Bandit; Neural Network; Dialog; Response Selection

全文

PDF


Empower Sequence Labeling with Task-Aware Neural Language Model

摘要

Linguistic sequence labeling is a general approach encompassing a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable models without handcrafted features. However, in many cases, it is hard to obtain sufficient annotations to train these models. In this study, we develop a neural framework to extract knowledge from raw texts and empower the sequence labeling task. Besides word-level knowledge contained in pre-trained word embeddings, character-aware neural language models are incorporated to extract character-level knowledge. Transfer learning techniques are further adopted to mediate different components and guide the language model towards the key knowledge. Comparing to previous methods, these task-specific knowledge allows us to adopt a more concise model and conduct more efficient training. Different from most transfer learning methods, the proposed framework does not rely on any additional supervision. It extracts knowledge from self-contained order information of training sequences. Extensive experiments on benchmark datasets demonstrate the effectiveness of leveraging character-level knowledge and the efficiency of co-training. For example, on the CoNLL03 NER task, model training completes in about 6 hours on a single GPU, reaching F_1 score of 91.71+/-0.10 without using any extra annotations.

关键词

Sequence Labeling; Lanuguage Model; Highway-networks

全文

PDF


Semantic Structure-Based Word Embedding by Incorporating Concept Convergence and Word Divergence

摘要

Representing the semantics of words is a fundamental task in text processing. Several research studies have shown that text and knowledge bases (KBs) are complementary sources for word embedding learning. Most existing methods only consider relationships within word-pairs in the usage of KBs. We argue that the structural information of well-organized words within the KBs is able to convey more effective and stable knowledge in capturing semantics of words. In this paper, we propose a semantic structure-based word embedding method, and introduce concept convergence and word divergence to reveal semantic structures in the word embedding learning process. To assess the effectiveness of our method, we use WordNet for training and conduct extensive experiments on word similarity, word analogy, text classification and query expansion. The experimental results show that our method outperforms state-of-the-art methods, including the methods trained solely on the corpus, and others trained on the corpus and the KBs.

关键词

Word embedding; nature language processing

全文

PDF


Improved Text Matching by Enhancing Mutual Information

摘要

Text matching is a core issue for question answering (QA), information retrieval (IR) and many other fields. We propose to reformulate the original text, i.e., generating a new text that is semantically equivalent to original text, to improve text matching degree. Intuitively, the generated text improves mutual information between two text sequences. We employ the generative adversarial network as the reformulation model where there is a discriminator to guide the text generating process. In this work, we focus on matching question and answers. The task is to rank answers based on QA matching degree. We first reformulate the original question without changing the asker's intent, then compute a relevance score for each answer. To evaluate the method, we collected questions and answers from Zhihu. In addition, we also conduct substantial experiments on public data such as SemEval and WikiQA to compare our method with existing methods. Experimental results demonstrate that after adding the reformulated question, the ranking performance across different matching models can be improved consistently, indicating that the reformulated question has enhanced mutual information and effectively bridged the semantic gap between QA.

关键词

LambdaRank; Question Rewrite

全文

PDF


Improving Language Modelling with Noise Contrastive Estimation

摘要

Neural language models do not scale well when the vocabulary is large. Noise contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, its full potential has not been demonstrated in the language modelling literature. A sufficient investigation of the hyperparameters in the NCE-based neural language models was clearly missing. In this paper, we showed that NCE can be a very successful approach in neural language modelling when the hyperparameters of a neural network are tuned appropriately. We introduced the `search-then-converge' learning rate schedule for NCE and designed a heuristic that specifies how to use this schedule. The impact of the other important hyperparameters, such as the dropout rate and the weight initialisation range, was also demonstrated. Using a popular benchmark, we showed that appropriate tuning of NCE in neural language models outperforms the state-of-the-art single-model methods based on standard dropout and the standard LSTM recurrent neural networks.

关键词

Language Modelling;Deep Learning;NCE;Neural Network;LSTM;RNN;Partition Function;Perplexity

全文

PDF


Sentence Ordering and Coherence Modeling using Recurrent Neural Networks

摘要

Modeling the structure of coherent texts is a key NLP problem. The task of coherently organizing a given set of sentences has been commonly used to build and evaluate models that understand such structure. We propose an end-to-end unsupervised deep learning approach based on the set-to-sequence framework to address this problem. Our model strongly outperforms prior methods in the order discrimination task and a novel task of ordering abstracts from scientific articles. Furthermore, our work shows that useful text representations can be obtained by learning to order sentences. Visualizing the learned sentence representations shows that the model captures high-level logical structure in paragraphs. Our representations perform comparably to state-of-the-art pre-training methods on sentence similarity and paraphrase detection tasks.

关键词

Sentence Ordering; Coherence modeling; coherence; discourse coherence

全文

PDF


Eliciting Positive Emotion through Affect-Sensitive Dialogue Response Generation: A Neural Network Approach

摘要

An emotionally-competent computer agent could be a valuable assistive technology in performing various affective tasks. For example caring for the elderly, low-cost ubiquitous chat therapy, and providing emotional support in general, by promoting a more positive emotional state through dialogue system interaction. However, despite the increase of interest in this task, existing works face a number of shortcomings: system scalability, restrictive modeling, and weak emphasis on maximizing user emotional experience. In this paper, we build a fully data driven chat-oriented dialogue system that can dynamically mimic affective human interactions by utilizing a neural network architecture. In particular, we propose a sequence-to-sequence response generator that considers the emotional context of the dialogue. An emotion encoder is trained jointly with the entire network to encode and maintain the emotional context throughout the dialogue. The encoded emotion information is then incorporated in the response generation process. We train the network with a dialogue corpus that contains positive-emotion eliciting responses, collected through crowd-sourcing. Objective evaluation shows that incorporation of emotion into the training process helps reduce the perplexity of the generated responses, even when a small dataset is used. Subsequent subjective evaluation shows that the proposed method produces responses that are more natural and likely to elicit a more positive emotion.

关键词

neural network; affective computing; dialogue response generation; dialogue system; emotion

全文

PDF


CoChat: Enabling Bot and Human Collaboration for Task Completion

摘要

Chatbots have drawn significant attention of late in both industry and academia. For most task completion bots in the industry, human intervention is the only means of avoiding mistakes in complex real-world cases. However, to the best of our knowledge, there is no existing research work modeling the collaboration between task completion bots and human workers. In this paper, we introduce CoChat, a dialog management framework to enable effective collaboration between bots and human workers. In CoChat, human workers can introduce new actions at any time to handle previously unseen cases. We propose a memory-enhanced hierarchical RNN (MemHRNN) to handle the one-shot learning challenges caused by instantly introducing new actions in CoChat. Extensive experiments on real-world datasets well demonstrate that CoChat can relieve most of the human workers’ workload, and get better user satisfaction rates comparing to other state-of-the-art frameworks.

关键词

Dialogue System

全文

PDF


Fact Checking in Community Forums

摘要

Community Question Answering (cQA) forums are very popular nowadays, as they represent effective means for communities around particular topics to share information. Unfortunately, this information is not always factual. Thus, here we explore a new dimension in the context of cQA, which has been ignored so far: checking the veracity of answers to particular questions in cQA forums. As this is a new problem, we create a specialized dataset for it. We further propose a novel multi-faceted model, which captures information from the answer content (what is said and how), from the author profile (who says it), from the rest of the community forum (where it is said), and from external authoritative sources of information (external support). Evaluation results show a MAP value of 86.54, which is 21 points absolute above the baseline.

关键词

Fact checking; veracity; community question answering

全文

PDF


Personalizing a Dialogue System With Transfer Reinforcement Learning

摘要

It is difficult to train a personalized task-oriented dialogue system because the data collected from each individual is often insufficient. Personalized dialogue systems trained on a small dataset is likely to overfit and make it difficult to adapt to different user needs. One way to solve this problem is to consider a collection of multiple users as a source domain and an individual user as a target domain, and to perform transfer learning from the source domain to the target domain. By following this idea, we propose a PErsonalized Task-oriented diALogue (PETAL) system, a transfer reinforcement learning framework based on POMDP, to construct a personalized dialogue system. The PETAL system first learns common dialogue knowledge from the source domain and then adapts this knowledge to the target domain. The proposed PETAL system can avoid the negative transfer problem by considering differences between the source and target users in a personalized Q-function. Experimental results on a real-world coffee-shopping data and simulation data show that the proposed PETAL system can learn optimal policies for different users, and thus effectively improve the dialogue quality under the personalized setting.

关键词

transfer learning; reinforcement learning; task-oriented dialogue system; personalized dialogue system; transfer reinforcement learning; multi-turn dialogue system;

全文

PDF


Context Aware Conversational Understanding for Intelligent Agents With a Screen

摘要

We describe an intelligent context-aware conversational system that incorporates screen context information to service multimodal user requests. Screen content is used for disambiguation of utterances that refer to screen objects and for enabling the user to act upon screen objects using voice commands. We propose a deep learning architecture that jointly models the user utterance and the screen and incorporates detailed screen content features. Our model is trained to optimize end to end semantic accuracy across contextual and non-contextual functionality, therefore learns the desired behavior directly from the data. We show that this approach outperforms a rule-based alternative, and can be extended in a straightforward manner to new contextual use cases. We perform detailed evaluation of contextual and non-contextual use cases and show that our system displays accurate contextual behavior without degrading the performance of non-contextual user requests.

关键词

intelligent conversational agents; context; screen integration; spoken language understanding; deep neural networks

全文

PDF


Controlling Global Statistics in Recurrent Neural Network Text Generation

摘要

Recurrent neural network language models (RNNLMs) are an essential component for many language generation tasks such as machine translation, summarization, and automated conversation. Often, we would like to subject the text generated by the RNNLM to constraints, in order to overcome systemic errors (e.g. word repetition) or achieve application-specific goals (e.g. more positive sentiment). In this paper, we present a method for training RNNLMs to simultaneously optimize likelihood and follow a given set of statistical constraints on text generation.  The problem is challenging because the statistical constraints are defined over aggregate model behavior, rather than model parameters, meaning that a straightforward parameter regularization approach is insufficient.  We solve this problem using a dynamic regularizer that updates as training proceeds, based on the generative behavior of the RNNLMs.  Our experiments show that the dynamic regularizer outperforms both generic training and a static regularization baseline.  The approach is successful at improving word-level repetition statistics by a factor of four in RNNLMs on a definition modeling task.  It also improves model perplexity when the statistical constraints are $n$-gram statistics taken from a large corpus.

关键词

recurrent neural network; natural language generation; regularization

全文

PDF


Few Shot Transfer Learning BetweenWord Relatedness and Similarity Tasks Using A Gated Recurrent Siamese Network

摘要

Word similarity and word relatedness are fundamental to natural language processing and more generally, understanding how humans relate concepts in semantic memory. A growing number of datasets are being proposed as evaluation benchmarks,however, the heterogeneity and focus of each respective dataset makes it difficult to draw plausible conclusions as to how a unified semantic model would perform. Additionally, we want to identify the transferability of knowledge obtained from one task to another, within the same domain and across domains. Hence, this paper first presents an evaluation and comparison of eight chosen datasets tested using the best performing regression models. As a baseline, we present regression models that incorporate both lexical featuresand word embeddings to produce consistent and competitive results compared to the state of the art.We present our main contribution, the best performing model across seven of the eight datasets - a Gated Recurrent Siamese Networkthat learns relationships between lexical word definitions.A parameter transfer learning strategy is employed for theSiamese Network. Subsequently, we present a secondary contribution which is the best performing non-sequential model:an Inductive and Transductive Transfer Learning strategy fortransferring decision trees within a Random Forest to a target task that is learned from only few instances. The method involves measuring semantic distance between hidden factored matrix representations of decision tree traversal matrices.

关键词

Transfer Learning, Siamese Networks, Random Forests

全文

PDF


Question-Answering with Grammatically-Interpretable Representations

摘要

We introduce an architecture, the Tensor Product RecurrentNetwork (TPRN). In our application of TPRN, internal representations—learned by end-to-end optimization in a deep neural network performing a textual question-answering(QA) task—can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model performs comparably to a state-of-the-art system on the SQuAD QA task.The internal representation which is interpreted is a Tensor Product Representation: for each input word, the model selects a symbol to encode the word, and a role in which to place the symbol, and binds the two together. The selection is via soft attention. The overall interpretation is built from interpretations of the symbols, as recruited by the trained model, and interpretations of the roles as used by the model. We find support for our initial hypothesis that symbols can be interpreted as lexical-semantic word meanings, while roles can be interpreted as approximations of grammatical roles (or categories)such as subject, wh-word, determiner, etc. Fine-grained analysis reveals specific correspondences between the learned roles and parts of speech as assigned by a standard tagger(Toutanova et al. 2003), and finds several discrepancies in the model’s favor. In this sense, the model learns significant aspectsof grammar, after having been exposed solely to linguistically unannotated text, questions, and answers: no prior linguistic knowledge is given to the model. What is given is the means to build representations using symbols and roles, with an inductive bias favoring use of these in an approximately discrete manner.

关键词

Question Answering, Tensor Product Representation (TPR), Deep Learning

全文

PDF


Canonical Correlation Inference for Mapping Abstract Scenes to Text

摘要

We describe a technique for structured prediction, based on canonical correlation analysis. Our learning algorithm finds two projections for the input and the output spaces that aim at projecting a given input and its correct output into points close to each other. We demonstrate our technique on a language-vision problem, namely the problem of giving a textual description to an "abstract scene".

关键词

全文

PDF


Exploring the Terrain of Metaphor Novelty: A Regression-Based Approach for Automatically Scoring Metaphors

摘要

Automatically scoring metaphor novelty has been largely unexplored, but could be of benefit to a wide variety of NLP applications. We introduce a large, publicly available metaphor novelty dataset to stimulate research in this area, and propose a regression-based approach to automatically score the novelty of potential metaphors that are expressed as word pairs. We additionally investigate which types of features are most useful for this task, and show that our approach outperforms baseline metaphor novelty scoring and standard metaphor detection approaches on this task.

关键词

metaphor; metaphor novelty; figurative language; natural language processing

全文

PDF


Two Knowledge-based Methods for High-Performance Sense Distribution Learning

摘要

Knowing the correct distribution of senses within a corpus can potentially boost the performance of Word Sense Disambiguation (WSD) systems by many points. We present two fully automatic and language-independent methods for computing the distribution of senses given a raw corpus of sentences. Intrinsic and extrinsic evaluations show that our methods outperform the current state of the art in sense distribution learning and the strongest baselines for the most frequent sense in multiple languages and on domain-specific test sets. Our sense distributions are available at http://trainomatic.org.

关键词

Word sense disambiguation, sense distribution learning, most frequent sense, wsd, nlp

全文

PDF


Attention-based Belief or Disbelief Feature Extraction for Dependency Parsing

摘要

Existing neural dependency parsers usually encode each word in a sentence with bi-directional LSTMs, and estimate the score of an arc from the LSTM representations of the head and the modifier, possibly missing relevant context information for the arc being considered. In this study, we propose a neural feature extraction method that learns to extract arc-specific features. We apply a neural network-based attention method to collect evidences for and against each possible head-modifier pair, with which our model computes certainty scores of belief and disbelief, and determines the final arc score by subtracting the score of disbelief from the one of belief. By explicitly introducing two kinds of evidences, the arc candidates can compete against each other based on more relevant information, especially for the cases where they share the same head or modifier. It makes possible to better discriminate two or more competing arcs by presenting their rivals (disbelief evidence). Experiments on various datasets show that our arc-specific feature extraction mechanism significantly improves the performance of bi-directional LSTM-based models by explicitly modeling long-distance dependencies. For both English and Chinese, the proposed model achieve a higher accuracy on dependency parsing task than most existing neural attention-based models.

关键词

全文

PDF


Multi-Task Learning For Parsing The Alexa Meaning Representation Language

摘要

The Alexa Meaning Representation Language (AMRL) is a compositional graph-based semantic representation that includes fine-grained types, properties, actions, and roles and can represent a wide variety of spoken language.  AMRL increases the ability of virtual assistants to represent more complex requests, including logical and conditional statements as well as ones with nested clauses. Due to this representational capacity, the acquisition of large scale data resources is challenging, which limits the accuracy of resulting models. This paper has two primary contributions. First, we develop a linearization of AMRL graphs along with a deep multi-task model that predicts fine-grained types, properties, and intents. Second, we show how to jointly train a model that predicts an existing representation for spoken language understanding (SLU) along with the linearized AMRL parse. The resulting model, which leverages learned embeddings from both tasks, is able to predict the AMRL representation more accurately than other approaches, decreasing the error rates in the full parse by 3.56% absolute and reducing the amount of natively annotated data needed to train accurate parsing models.

关键词

全文

PDF


Bayesian Verb Sense Clustering

摘要

This work performs verb sense induction and clustering based on observed syntactic distributions in a large corpus. VerbNet is a hierarchical clustering of verbs and a useful semantic resource. We address the main drawbacks of VerbNet, by proposing a Bayesian model to build VerbNet-like clusters automatically and with full coverage. Relative to the prior state of the art, we improve accuracy on verb sense induction by over 20% absolute F1. We then propose a new model, inspired by the positive pointwise mutual information (PPMI). Our PPMI-based mixture model permits an extremely efficient sampler, while improving performance. Our best model shows a 4.5% absolute F1 improvement over the best non-PPMI model, with over an order of magnitude less computation time. Though this model is inspired by clustering verb senses, it may be applicable in other situations where multiple items are being sampled as a group.

关键词

Bayesian mixture models, semantic clusters, verb resources

全文

PDF


DeepType: Multilingual Entity Linking by Neural Type System Evolution

摘要

The wealth of structured (e.g. Wikidata) and unstructured data about the world available today presents an incredible opportunity for tomorrow's Artificial Intelligence. So far, integration of these two different modalities is a difficult process, involving many decisions concerning how best to represent the information so that it will be captured or useful, and hand-labeling large amounts of data.DeepType overcomes this challenge by explicitly integrating symbolic information into the reasoning process of a neural network with a type system.First we construct a type system, and second, we use it to constrain the outputs of a neural network to respect the symbolic structure. We achieve this by reformulating the design problem into a mixed integer problem: create a type system and subsequently train a neural network with it. In this reformulation discrete variables select which parent-child relations from an ontology are types within the type system, while continuous variables control a classifier fit to the type system. The original problem cannot be solved exactly, so we propose a 2-step algorithm: 1) heuristic search or stochastic optimization over discrete variables that define a type system informed by an Oracle and a Learnability heuristic, 2) gradient descent to fit classifier parameters.We apply DeepType to the problem of Entity Linking on three standard datasets (i.e. WikiDisamb30, CoNLL (YAGO), TAC KBP 2010) and find that it outperforms all existing solutions by a wide margin, including approaches that rely on a human-designed type system or recent deep learning-based entity embeddings, while explicitly using symbolic information lets it integrate new entities without retraining.

关键词

Entity Linking Deep Learning Evolution NLP Machine Learning

全文

PDF


Order-Planning Neural Text Generation From Structured Data

摘要

Generating texts from structured data (e.g., a table) is important for various natural language processing tasks such as question answering and dialog systems. In recent studies, researchers use neural language models and encoder-decoder frameworks for table-to-text generation. However, these neural network-based approaches typically do not model the order of content during text generation. When a human writes a summary based on a given table, he or she would probably consider the content order before wording. In this paper, we propose an order-planning text generation model, where order information is explicitly captured by link-based attention. Then a self-adaptive gate combines the link-based attention with traditional content-based attention. We conducted experiments on the WikiBio dataset and achieve higher performance than previous methods in terms of BLEU, ROUGE, and NIST scores; we also performed ablation tests to analyze each component of our model.

关键词

text generation; order planning; neural network

全文

PDF


A Multi-View Fusion Neural Network for Answer Selection

摘要

Community question answering aims at choosing the most appropriate answer for a given question, which is important in many NLP applications. Previous neural network-based methods consider several different aspects of information through calculating attentions. These different kinds of attentions are always simply summed up and can be seen as a ``single view", causing severe information loss. To overcome this problem, we propose a Multi-View Fusion Neural Network, where each attention component generates a ``view'' of the QA pair and a fusion RNN integrates the generated views to form a more holistic representation.    In this fusion RNN method, a filter gate  collects  important information of  input and directly adds it to the output, which borrows the idea of residual networks.    Experimental results on the WikiQA and SemEval-2016 CQA datasets demonstrate that our proposed model outperforms the state-of-the-art methods.

关键词

deep learning, answer selection, multi-view

全文

PDF


Generating Sentences Using a Dynamic Canvas

摘要

We introduce the Attentive Unsupervised Text (W)riter (AUTR), which is a word level generative model for natural language. It uses a recurrent neural network with a dynamic attention and canvas memory mechanism to iteratively construct sentences. By viewing the state of the memory at intermediate stages and where the model is placing its attention, we gain insight into how it constructs sentences. We demonstrate that AUTR learns a meaningful latent representation for each sentence, and achieves competitive log-likelihood lower bounds whilst being computationally efficient. It is effective at generating and reconstructing sentences, as well as imputing missing words.

关键词

NLP; bayesian; variational; generative; sentences

全文

PDF


Deconvolutional Latent-Variable Model for Text Sequence Matching

摘要

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (generator), providing learned latent codes with more semantic information and better generalization. Our model, trained in an unsupervised manner, yields stronger empirical predictive performance than a decoder based on Long Short-Term Memory (LSTM), with less parameters and considerably faster training. Further, we apply it to text sequence-matching problems. The proposed model significantly outperforms several strong sentence-encoding baselines, especially in the semi-supervised setting.

关键词

Variational auto-encoder, text sequence matching, deconvolutional networks

全文

PDF


DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

摘要

Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)," is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.

关键词

Deep Learning; Attention Mechanism; Natural Language Processing; Sentence Encoding; Text Classification

全文

PDF


Improving Variational Encoder-Decoders in Dialogue Generation

摘要

Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, yielding the KL-vanishing problem and inconsistent training objective. In this paper, we separate the training step into two phases: The first phase learns to autoencode discrete texts into continuous embeddings, from which the second phase learns to generalize latent representations by reconstructing the encoded embedding.  In this case, latent variables are sampled by transforming Gaussian noise through multi-layer perceptrons and are trained with a separate VED model, which has the potential of realizing a much more flexible distribution. We compare our model with current popular models and the experiment demonstrates substantial improvement in both metric-based and human evaluations.

关键词

全文

PDF


Neural Cross-Lingual Entity Linking

摘要

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem exacerbates with cross-lingual EL which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compute similarity between textual fragments across languages. In this paper, we propose a neural EL model that trains fine-grained similarities and dissimilarities between the query and candidate document from multiple perspectives, combined with convolution and tensor networks. Further, we show that this English-trained system can be applied, in zero-shot learning, to other languages by making surprisingly effective use of multi-lingual embeddings. The proposed system has strong empirical evidence yielding state-of-the-art results in English as well as cross-lingual: Spanish and Chinese TAC 2015 datasets.

关键词

NLP; Information Extraction; Entity Linking; Entity Disambiguation

全文

PDF


Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

摘要

Automated multi-document extractive text summarization is a widely studied research problem in the field of natural language understanding. Such extractive mechanisms compute in some form the worthiness of a sentence to be included into the summary. While the conventional approaches rely on human crafted document-independent features to generate a summary, we develop a data-driven novel summary system called HNet, which exploits the various semantic and compositional aspects latent in a sentence to capture document independent features. The network learns sentence representation in a way that, salient sentences are closer in the vector space than non-salient sentences. This semantic and compositional feature vector is then concatenated with the document-dependent features for sentence ranking. Experiments on the DUC benchmark datasets (DUC-2001, DUC-2002 and DUC-2004) indicate that our model shows significant performance gain of around 1.5-2 points in terms of ROUGE score compared with the state-of-the-art baselines.

关键词

Natural Language; Summarization; Deep Learning

全文

PDF


Spectral Word Embedding with Negative Sampling

摘要

In this work, we investigate word embedding algorithms in the context of natural language processing. In particular, we examine the notion of ``negative examples'', the unobserved or insignificant word-context co-occurrences, in spectral methods. we provide a new formulation for the word embedding problem by proposing a new intuitive objective function that perfectly justifies the use of negative examples. In fact, our algorithm not only learns from the important word-context co-occurrences, but also it learns from the abundance of unobserved or insignificant co-occurrences to improve the distribution of words in the latent embedded space. We analyze the algorithm theoretically and provide an optimal solution for the problem using spectral analysis. We have trained various word embedding algorithms on articles of Wikipedia with 2.1 billion tokens and show that negative sampling can boost the quality of spectral methods. Our algorithm provides results as good as the state-of-the-art but in a much faster and efficient way.

关键词

Word Embedding; Natural Language Processing; Unsupervised Learning; Matrix Factorization; Spectral Algorithms; Singular Value Decomposition

全文

PDF


Variational Recurrent Neural Machine Translation

摘要

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper. Different from the variational NMT, VRNMT introduces a series of latent random variables to model the translation procedure of a sentence in a generative way, instead of a single latent variable. Specifically, the latent random variables are included into the hidden states of the NMT decoder with elements from the variational autoencoder. In this way, these variables are recurrently generated, which enables them to further capture strong and complex dependencies among the output translations at different timesteps. In order to deal with the challenges in performing efficient posterior inference and large-scale training during the incorporation of latent variables, we build a neural posterior approximator, and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models.

关键词

Machine Translation

全文

PDF


Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method

摘要

Generating plausible and fluent sentence with desired properties has long been a challenge. Most of the recent works use recurrent neural networks (RNNs) and their variants to predict following words given previous sequence and target label. In this paper, we propose a novel framework to generate constrained sentences via Gibbs Sampling. The candidate sentences are revised and updated iteratively, with sampled new words replacing old ones. Our experiments show the effectiveness of the proposed method to generate plausible and diverse sentences.

关键词

natural language generation; MCMC; gibbs sampling

全文

PDF


Source-Target Inference Models for Spatial Instruction Understanding

摘要

Models that can execute natural language instructions for situated robotic tasks such as assembly and navigation have several useful applications in homes, offices, and remote scenarios.We study the semantics of spatially-referred configuration and arrangement instructions, based on the challenging Bisk-2016 blank-labeled block dataset. This task involves finding a source block and moving it to the target position (mentioned via a reference block and offset), where the blocks have no names or colors and are just referred to via spatial location features.We present novel models for the subtasks of source block classification and target position regression, based on joint-loss language and spatial-world representation learning, as well as CNN-based and dual attention models to compute the alignment between the world blocks and the instruction phrases. For target position prediction, we compare two inference approaches: annealed sampling via policy gradient versus expectation inference via supervised regression. Our models achieve the new state-of-the-art on this task, with an improvement of 47% on source block accuracy and 22% on target position distance.

关键词

全文

PDF


Cross Temporal Recurrent Networks for Ranking Question Answer Pairs

摘要

Temporal gates play a significant role in modern recurrent-based neural encoders, enabling fine-grained control over recursive compositional operations over time. In recurrent models such as the long short-term memory (LSTM), temporal gates control the amount of information retained or discarded over time, not only playing an important role in influencing the learned representations but also serving as a protection against vanishing gradients. This paper explores the idea of learning temporal gates for sequence pairs (question and answer), jointly influencing the learned representations in a pairwise manner. In our approach, temporal gates are learned via 1D convolutional layers and then subsequently cross applied across question and answer for joint learning. Empirically, we show that this conceptually simple sharing of temporal gates can lead to competitive performance across multiple benchmarks. Intuitively, what our network achieves can be interpreted as learning representations of question and answer pairs that are aware of what each other is remembering or forgetting, i.e., pairwise temporal gating. Via extensive experiments, we show that our proposed model achieves state-of-the-art performance on two community-based QA datasets and competitive performance on one factoid-based QA dataset.

关键词

Deep Learning; Neural Network; Question Answering; Community Question Answering; Recurrent Neural Network

全文

PDF


Guiding Exploratory Behaviors for Multi-Modal Grounding of Linguistic Descriptions

摘要

A major goal of grounded language learning research is to enable robots to connect language predicates to a robot's physical interactive perception of the world. Coupling object exploratory behaviors such as grasping, lifting, and looking with multiple sensory modalities (e.g., audio, haptics, and vision) enables a robot to ground non-visual words like ``heavy'' as well as visual words like ``red''. A major limitation of existing approaches to multi-modal language grounding is that a robot has to exhaustively explore training objects with a variety of actions when learning a new such language predicate. This paper proposes a method for guiding a robot's behavioral exploration policy when learning a novel predicate based on known grounded predicates and the novel predicate's linguistic relationship to them. We demonstrate our approach on two datasets in which a robot explored large sets of objects and was tasked with learning to recognize whether novel words applied to those objects.

关键词

Multi-modal grounding; NLP; Human-robot interaction

全文

PDF


Learning Better Name Translation for Cross-Lingual Wikification

摘要

A notable challenge in cross-lingual wikification is the problem of retrieving English Wikipedia title candidates given a non-English mention, a step that requires translating names written in a foreign language into English. Creating training data for name translation requires significant amount of human efforts. In order to cover as many languages as possible, we propose a probabilistic model that leverages indirect supervision signals in a knowledge base. More specifically, the model learns name translation from title pairs obtained from the inter-language links in Wikipedia. The model jointly considers word alignment and word transliteration. Comparing to 6 other approaches on 9 languages, we show that the proposed model outperforms others not only on the transliteration metric, but also on the ability to generate target English titles for a cross-lingual wikifier. Consequently, as we show, it improves the end-to-end performance of a cross-lingual wikifier on the TAC 2016 EDL dataset.

关键词

Information Extraction; Name Translation; Cross-Lingual; Entity Llinking; Wikification

全文

PDF


Learning Latent Opinions for Aspect-level Sentiment Classification

摘要

Aspect-level sentiment classification aims at detecting the sentiment expressed towards a particular target in a sentence. Based on the observation that the sentiment polarity is often related to specific spans in the given sentence, it is possible to make use of such information for better classification. On the other hand, such information can also serve as justifications associated with the predictions.We propose a segmentation attention based LSTM model which can effectively capture the structural dependencies between the target and the sentiment expressions with a linear-chain conditional random field (CRF) layer. The model simulates human's process of inferring sentiment information when reading: when given a target, humans tend to search for surrounding relevant text spans in the sentence before making an informed decision on the underlying sentiment information.We perform sentiment classification tasks on publicly available datasets on online reviews across different languages from SemEval tasks and social comments from Twitter. Extensive experiments show that our model achieves the state-of-the-art performance while extracting interpretable sentiment expressions. 

关键词

natural language processing;sentiment classification

全文

PDF


MathDQN: Solving Arithmetic Word Problems via Deep Reinforcement Learning

摘要

Designing an automatic solver for math word problems has been considered as a crucial step towards general AI, with the ability of natural language understanding and logical inference. The state-of-the-art performance was achieved by enumerating all the possible expressions from the quantities in the text and customizing a scoring function to identify the one with the maximum probability. However, it incurs exponential search space with the number of quantities and beam search has to be applied to trade accuracy for efficiency. In this paper, we make the first attempt of applying deep reinforcement learning to solve arithmetic word problems. The motivation is that deep Q-network has witnessed success in solving various problems with big search space and achieves promising performance in terms of both accuracy and running time. To fit the math problem scenario, we propose our MathDQN that is customized from the general deep reinforcement learning framework. Technically, we design the states, actions, reward function, together with a feed-forward neural network as the deep Q-network. Extensive experimental results validate our superiority over state-of-the-art methods. Our MathDQN yields remarkable improvement on most of datasets and boosts the average precision among all the benchmark datasets by 15\%.

关键词

全文

PDF


Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization

摘要

Neural machine translation (NMT) heavily relies on parallel bilingual data for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to learn from and regularize the training of NMT models using monolingual data. The key technical challenge of this approach is that there are exponentially many source sentences for a target monolingual sentence while computing the sum of the conditional probability given each possible source sentence. We address this challenge by leveraging the dual translation model (target-to-source translation) to sample several mostly likely source-side sentences and avoid enumerating all possible candidate source sentences. That is, we transfer the knowledge contained in the dual model to boost the training of the primal model (source-to-target translation), and we call such an approach dual transfer learning. Experiment results on English-French and German-English tasks demonstrate that dual transfer learning achieves significant improvement over several strong baselines and obtains new state-of-the-art results.

关键词

transfer learning; semi-supervised neural machine translation; importance sampling

全文

PDF


A Neural Transition-Based Approach for Semantic Dependency Graph Parsing

摘要

Semantic dependency graph has been recently proposed as an extension of tree-structured syntactic or semantic representation for natural language sentences. It particularly features the structural property of multi-head, which allows nodes to have multiple heads, resulting in a directed acyclic graph(DAG) parsing problem. Yet most statistical parsers focused exclusively on shallow bi-lexical tree structures, DAG parsing remains under-explored. In this paper, we propose a neural transition-based parser, using a variant of list-based arc-eager transition algorithm for dependency graph parsing. Particularly, two non-trivial improvements are proposed for representing the key components of the transition system, to better capture the semantics of segments and internal sub-graph structures. We test our parser on the SemEval-2016 Task 9 dataset (Chinese) and the SemEval-2015 Task 18 dataset (English). On both benchmark datasets, we obtain superior or comparable results to the best performing systems. Our parser can be further improved with a simple ensemble mechanism, resulting in the state-of-the-art performance.

关键词

Semantic Dependency Graph; Transition-Based Parsing; Bi-LSTM Subtraction; Incremental Tree-LSTM

全文

PDF


StarSpace: Embed All The Things!

摘要

We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification,ranking tasks such as information retrieval/web search,collaborative filtering-based  or content-based recommendation,embedding of multi-relational graphs, and learning word, sentence or document level embeddings.In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task.Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not.

关键词

Nature Language Processing; Text Classification; Knowledge Representation; Recommender Systems

全文

PDF


Word Attention for Sequence to Sequence Text Understanding

摘要

Attention mechanism has been a key component in Recurrent Neural Networks (RNNs) based sequence to sequence learning framework, which has been adopted in many text understanding tasks, such as neural machine translation and abstractive summarization. In these tasks, the attention mechanism models how important each part of the source sentence is to generate a target side word. To compute such importance scores, the attention mechanism summarizes the source side information in the encoder RNN hidden states (i.e., h_t), and then builds a context vector for a target side word upon a subsequence representation of the source sentence, since h_t actually summarizes the information of the subsequence containing the first t-th words in the source sentence. We in this paper, show that an additional attention mechanism called word attention, that builds itself upon word level representations, significantly enhances the performance of sequence to sequence learning. Our word attention can enrich the source side contextual representation by directly promoting the clean word level information in each step. Furthermore, we propose to use contextual gates to dynamically combine the subsequence level and word level contextual information. Experimental results on abstractive summarization and neural machine translation show that word attention significantly improve over strong baselines.

关键词

全文

PDF


Knowledge Enhanced Hybrid Neural Network for Text Matching

摘要

Long text brings a big challenge to neural network based text matching approaches due to their complicated structures. To tackle the challenge, we propose a knowledge enhanced hybrid neural network (KEHNN) that leverages prior knowledge to identify useful information and filter out noise in long text and performs matching from multiple perspectives. The model fuses prior knowledge into word representations by knowledge gates and establishes three matching channels with words, sequential structures of text given by Gated Recurrent Units (GRUs), and knowledge enhanced representations. The three channels are processed by a convolutional neural network to generate high level features for matching, and the features are synthesized as a matching score by a multilayer perceptron. In this paper, we focus on exploring the use of taxonomy knowledge for text matching. Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text.

关键词

text matching; neural network; deep learning; taxonamy knowledge

全文

PDF


Neural Response Generation With Dynamic Vocabularies

摘要

We study response generation for open domain conversation in chatbots. Existing methods assume that words in responses are generated from an identical vocabulary regardless of their inputs, which not only makes them vulnerable to generic patterns and irrelevant noise, but also causes a high cost in decoding. We propose a dynamic vocabulary sequence-to-sequence (DVS2S) model which allows each input to possess their own vocabulary in decoding. In training, vocabulary construction and response generation are jointly learned by maximizing a lower bound of the true objective with a Monte Carlo sampling method. In inference, the model dynamically allocates a small vocabulary for an input with the word prediction model, and conducts decoding only with the small vocabulary. Because of the dynamic vocabulary mechanism, DVS2S eludes many generic patterns and irrelevant words in generation, and enjoys efficient decoding at the same time. Experimental results on both automatic metrics and human annotations show that DVS2S can significantly outperform state-of-the-art methods in terms of response quality, but only requires 60% decoding time compared to the most efficient baseline.

关键词

chatbot; dialogue system; response generation; generation

全文

PDF


Learning to Extract Coherent Summary via Deep Reinforcement Learning

摘要

Coherence plays a critical role in producing a high-quality summary from a document. In recent years, neural extractive summarization is becoming increasingly attractive. However, most of them ignore the coherence of summaries when extracting sentences. As an effort towards extracting coherent summaries, we propose a neural coherence model to capture the cross-sentence semantic and syntactic coherence patterns. The proposed neural coherence model obviates the need for feature engineering and can be trained in an end-to-end fashion using unlabeled data. Empirical results show that the proposed neural coherence model can efficiently capture the cross-sentence coherence patterns. Using the combined output of the neural coherence model and ROUGE package as the reward, we design a reinforcement learning method to train a proposed neural extractive summarizer which is named Reinforced Neural Extractive Summarization (RNES) model. The RNES model learns to optimize coherence and informative importance of the summary simultaneously. The experimental results show that the proposed RNES outperforms existing baselines and achieves state-of-the-art performance in term of ROUGE on CNN/Daily Mail dataset. The qualitative evaluation indicates that summaries produced by RNES are more coherent and readable.

关键词

extractive summarization; deep reinforcement learning; coherence modeling

全文

PDF


Hierarchical Recurrent Attention Network for Response Generation

摘要

We study multi-turn response generation in chatbots where a response is generated according to a conversation context.   Existing work has modeled the hierarchy of the context, but does not pay enough attention to the fact that words and utterances in the context are differentially important. As a result, they may lose important information in context and generate irrelevant responses. We propose a hierarchical recurrent attention network (HRAN) to model both the hierarchy and the importance variance in a unified framework. In HRAN, a hierarchical attention mechanism attends to important parts within and among utterances with word level attention and utterance level attention respectively.

关键词

全文

PDF


How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks

摘要

With the recent advances of neural models and natural language processing, automatic generation of classical Chinese poetry has drawn significant attention due to its artistic and cultural value. Previous works mainly focus on generating poetry given keywords or other text information, while visual inspirations for poetry have been rarely explored. Generating poetry from images is much more challenging than generating poetry from text, since images contain very rich visual information which cannot be described completely using several keywords, and a good poem should convey the image accurately. In this paper, we propose a memory based neural model which exploits images to generate poems. Specifically, an Encoder-Decoder model with a topic memory network is proposed to generate classical Chinese poetry from images. To the best of our knowledge, this is the first work attempting to generate classical Chinese poetry from images with neural networks. A comprehensive experimental investigation with both human evaluation and quantitative analysis demonstrates that the proposed model can generate poems which convey images accurately.

关键词

Image; Poetry Generation; Memory Networks

全文

PDF


Learning Multi-Modal Word Representation Grounded in Visual Context

摘要

Representing the semantics of words is a long-standing problem for the natural language processing community. Most methods compute word semantics given their textual context in large corpora. More recently, researchers attempted to integrate perceptual and visual features. Most of these works consider the visual appearance of objects to enhance word representations but they ignore the visual environment and context in which objects appear. We propose to unify text-based techniques with vision-based techniques by simultaneously leveraging textual and visual context to learn multimodal word embeddings. We explore various choices for what can serve as a visual context and present an end-to-end method to integrate visual context elements in a multimodal skip-gram model. We provide experiments and extensive analysis of the obtained results.

关键词

multimodal representations; representation learning; word similarity; feature-norm prediction; concreteness; word embeddings; visual context; spatial information

全文

PDF


Memory Fusion Network for Multi-view Sequential Learning

摘要

Multi-view sequential learning is a fundamental problem in machine learning dealing with multi-view sequences. In a multi-view sequence, there exists two forms of interactions between different views: view-specific interactions and cross-view interactions. In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly accounts for both interactions in a neural architecture and continuously models them through time. The first component of the MFN is called the System of LSTMs, where view-specific interactions are learned in isolation through assigning an LSTM function to each view. The cross-view interactions are then identified using a special attention mechanism called the Delta-memory Attention Network (DMAN) and summarized through time with a Multi-view Gated Memory. Through extensive experimentation, MFN is compared to various proposed approaches for multi-view sequential learning on multiple publicly available benchmark datasets. MFN outperforms all the multi-view approaches. Furthermore, MFN outperforms all current state-of-the-art models, setting new state-of-the-art results for all three multi-view datasets.

关键词

Sentiment Analysis; Emotion Recognition; Personality Traits Recognition; Multimodal Fusion

全文

PDF


Multi-attention Recurrent Network for Human Communication Comprehension

摘要

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication, however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape the communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state- of-the-art results performance in all the datasets.

关键词

Multimodal Machine Learning; Attention Networks; Attention Modeling; Natural Language Processing

全文

PDF


Chinese LIWC Lexicon Expansion via Hierarchical Classification of Word Embeddings with Sememe Attention

摘要

Linguistic Inquiry and Word Count (LIWC) is a word counting software tool which has been used for quantitative text analysis in many fields. Due to its success and popularity, the core lexicon has been translated into Chinese and many other languages. However, the lexicon only contains several thousand of words, which is deficient compared with the number of common words in Chinese. Current approaches often require manually expanding the lexicon, but it often takes too much time and requires linguistic experts to extend the lexicon. To address this issue, we propose to expand the LIWC lexicon automatically. Specifically, we consider it as a hierarchical classification problem and utilize the Sequence-to-Sequence model to classify words in the lexicon. Moreover, we use the sememe information with the attention mechanism to capture the exact meanings of a word, so that we can expand a more precise and comprehensive lexicon. The experimental results show that our model has a better understanding of word meanings with the help of sememes and achieves significant and consistent improvements compared with the state-of-the-art methods. The source code of this paper can be obtained from https://github.com/thunlp/Auto_CLIWC.

关键词

lexicon expansion; hierarchical classification; machine learning; natural language processing

全文

PDF


Large Scaled Relation Extraction With Reinforcement Learning

摘要

Sentence relation extraction aims to extract relational facts from sentences, which is an important task in natural language processing field. Previous models rely on the manually labeled supervised dataset. However, the human annotation is costly and limits to the number of relation and data size, which is difficult to scale to large domains. In order to conduct largely scaled relation extraction, we utilize an existing knowledge base to heuristically align with texts, which not rely on human annotation and easy to scale. However, using distant supervised data for relation extraction is facing a new challenge: sentences in the distant supervised dataset are not directly labeled and not all sentences that mentioned an entity pair can represent the relation between them. To solve this problem, we propose a novel model with reinforcement learning. The relation of the entity pair is used as distant supervision and guide the training of relation extractor with the help of reinforcement learning method. We conduct two types of experiments on a publicly released dataset. Experiment results demonstrate the effectiveness of the proposed method compared with baseline models, which achieves 13.36\% improvement.

关键词

relation extraction; reinforcement learning; large scaled

全文

PDF


End-to-End Quantum-like Language Models with Application to Question Answering

摘要

Language Modeling (LM) is a fundamental research topic in a range of areas. Recently, inspired by quantum theory, a novel Quantum Language Model (QLM) has been proposed for Information Retrieval (IR). In this paper, we aim to broaden the theoretical and practical basis of QLM. We develop a Neural Network based Quantum-like Language Model (NNQLM) and apply it to Question Answering. Specifically, based on word embeddings, we design a new density matrix, which represents a sentence (e.g., a question or an answer) and encodes a mixture of semantic subspaces. Such a density matrix, together with a joint representation of the question and the answer, can be integrated into neural network architectures (e.g., 2-dimensional convolutional neural networks). Experiments on the TREC-QA and WIKIQA datasets have verified the effectiveness of our proposed models.

关键词

Question Answering; Quantum-like Language Model; Neural Network

全文

PDF


Adaptive Co-attention Network for Named Entity Recognition in Tweets

摘要

In this study, we investigate the problem of named entity recognition for tweets. Named entity recognition is an important task in natural language processing and has been carefully studied in recent decades.  Previous named entity recognition methods usually only used the textual content when processing tweets. However, many tweets contain not only textual content, but also images. Such visual information is also valuable in the name entity recognition task. To make full use of textual and visual information, this paper proposes a novel method to process tweets that contain multimodal information. We extend a bi-directional long short term memory network with conditional random fields and an adaptive co-attention network to achieve this task.  To evaluate the proposed methods, we constructed a large scale labeled dataset that contained multimodal tweets. Experimental results demonstrated that the proposed method could achieve a better performance than the previous methods in most cases.

关键词

Deep Learning, Named Entity Recognition, Multimodal

全文

PDF


Neural Networks Incorporating Dictionaries for Chinese Word Segmentation

摘要

In recent years, deep neural networks have achieved significant success in Chinese word segmentation and many other natural language processing tasks. Most of these algorithms are end-to-end trainable systems and can effectively process and learn from large scale labeled datasets. However, these methods typically lack the capability of processing rare words and data whose domains are different from training data. Previous statistical methods have demonstrated that human knowledge can provide valuable information for handling rare cases and domain shifting problems. In this paper, we seek to address the problem of incorporating dictionaries into neural networks for the Chinese word segmentation task. Two different methods that extend the bi-directional long short-term memory neural network are proposed to perform the task. To evaluate the performance of the proposed methods, state-of-the-art supervised models based methods and domain adaptation approaches are compared with our methods on nine datasets from different domains. The experimental results demonstrate that the proposed methods can achieve better performance than other state-of-the-art neural network methods and domain adaptation approaches in most cases.

关键词

Chinese word segmentation; Deep Learning

全文

PDF


Addressee and Response Selection in Multi-Party Conversations With Speaker Interaction RNNs

摘要

In this paper, we study the problem of addressee and response selection in multi-party conversations. Understanding multi-party conversations is challenging because of complex speaker interactions: multiple speakers exchange messages with each other, playing different roles (sender, addressee, observer), and these roles vary across turns. To tackle this challenge, we propose the Speaker Interaction Recurrent Neural Network (SI-RNN). Whereas the previous state-of-the-art system updated speaker embeddings only for the sender, SI-RNN uses a novel dialog encoder to update speaker embeddings in a role-sensitive way. Additionally, unlike the previous work that selected the addressee and response separately, SI-RNN selects them jointly by viewing the task as a sequence prediction problem. Experimental results show that SI-RNN significantly improves the accuracy of addressee and response selection, particularly in complex conversations with many speakers and responses to distant messages many turns in the past.

关键词

Dialog System;Speaker Embedding;Addressee and Response Selection

全文

PDF


Asynchronous Bidirectional Decoding for Neural Machine Translation

摘要

The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation. Traditionally, the NMT decoders adopt recurrent neural networks (RNNs) to perform translation in a left-to-right manner, leaving the target-side contexts generated from right to left unexploited during translation. In this paper, we equip the conventional attentional encoder-decoder NMT framework with a backward decoder, in order to explore bidirectional decoding for NMT. Attending to the hidden state sequence produced by the encoder, our backward decoder first learns to generate the target-side hidden state sequence from right to left. Then, the forward decoder performs translation in the forward direction, while in each translation prediction timestep, it simultaneously applies two attention models to consider the source-side and reverse target-side hidden states, respectively. With this new architecture, our model is able to fully exploit source- and target-side contexts to improve translation quality altogether. Experimental results on NIST Chinese-English and WMT English-German translation tasks demonstrate that our model achieves substantial improvements over the conventional NMT by 3.14 and 1.38 BLEU points, respectively. The source code of this work can be obtained from https://github.com/DeepLearnXMU/ABDNMT.

关键词

Machine Translation

全文

PDF


Medical Exam Question Answering with Large-scale Reading Comprehension

摘要

Reading and understanding text is one important component in computer aided diagnosis in clinical medicine, also being a major research problem in the field of NLP.  In this work, we introduce a question-answering task called MedQA to study answering questions in clinical medicine using knowledge in a large-scale document collection. The aim of MedQA is to answer real-world questions with large-scale reading comprehension. We propose our solution SeaReader---a modular end-to-end reading comprehension model based on LSTM networks and dual-path attention architecture. The novel dual-path attention models information flow from two perspectives and has the ability to simultaneously read individual documents and integrate information across multiple documents. In experiments our SeaReader achieved a large increase in accuracy on MedQA over competing models.  Additionally, we develop a series of novel techniques to demonstrate the interpretation of the question answering process in SeaReader.

关键词

reading comprehension; question answering; clinical medicine; aided diagnosis

全文

PDF


CoLink: An Unsupervised Framework for User Identity Linkage

摘要

Nowadays, it is very common for one person to be in different social networks. Linking identical users across different social networks, also known as the User Identity Linkage (UIL) problem, is fundamental for many applications. There are two major challenges in the UIL problem. First, it's extremely expensive to collect manually linked user pairs as training data. Second, the user attributes in different networks are usually defined and formatted very differently which makes attribute alignment very hard. In this paper we propose CoLink, a general unsupervised framework for the UIL problem. CoLink employs a co-training algorithm, which manipulates two independent models, the attribute-based model and the relationship-based model, and makes them reinforce each other iteratively in an unsupervised way. We also propose the sequence-to-sequence learning as a very effective implementation of the attribute-based model, which can well handle the challenge of the attribute alignment by treating it as a machine translation problem. We apply CoLink to a UIL task of mapping the employees in an enterprise network to their LinkedIn profiles. The experiment results show that CoLink generally outperforms the state-of-the-art unsupervised approaches by an F1 increase over 20%.

关键词

User Identity Linkage; Sequence-to-sequence learning

全文

PDF


Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation

摘要

Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoder ignore this structure in human language and might generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results from linguistic analysis into the process of sentence generation for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. For training acceleration, we propose a tree canonicalization method, which transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2Tree framework outperforms baseline methods over 11.15% increase of acceptance ratio.

关键词

tree-structured neural network; dialog generation

全文

PDF


Elastic Responding Machine for Dialog Generation with Dynamically Mechanism Selecting

摘要

Neural models aiming at generating meaningful and diverse response is attracting increasing attention over recent years. For a given post, the conventional encoder-decoder models tend to learn high-frequency but trivial responses, or are difficult to determine which speaking styles are suitable to generate responses. To address this issue, we propose the elastic responding machine (ERM), which is based on a proposed encoder-diverter-filter-decoder framework. ERM models the multiple responding mechanisms to not only generate acceptable responses for a given post but also improve the diversity of responses. Here, the mechanisms could be regraded as some latent variables, and for a given post different responses may be generated by different mechanisms. The experiments demonstrate the quality and diversity of the generated responses, intuitively show how the learned model controls response mechanism when responding, and reveal some underlying relationship between mechanism and language style.

关键词

dialog diversity; dialog generation; neural network

全文

PDF


RNN-Based Sequence-Preserved Attention for Dependency Parsing

摘要

Recurrent neural networks (RNN) combined with attention mechanism has proved to be useful for various NLP tasks including machine translation, sequence labeling and syntactic parsing. The attention mechanism is usually applied by estimating the weights (or importance) of inputs and taking the weighted sum of inputs as derived features. Although such features have demonstrated their effectiveness, they may fail to capture the sequence information due to the simple weighted sum being used to produce them. The order of the words does matter to the meaning or the structure of the sentences, especially for syntactic parsing, which aims to recover the structure from a sequence of words. In this study, we propose an RNN-based attention to capture the relevant and sequence-preserved features from a sentence, and use the derived features to perform the dependency parsing. We evaluated the graph-based and transition-based parsing models enhanced with the RNN-based sequence-preserved attention on the both English PTB and Chinese CTB datasets. The experimental results show that the enhanced systems were improved with significant increase in parsing accuracy.

关键词

全文

PDF


Generative Adversarial Network Based Heterogeneous Bibliographic Network Representation for Personalized Citation Recommendation

摘要

Network representation has been recently exploited for many applications, such as citation recommendation, multi-label classification and link prediction. It learns low-dimensional vector representation for each vertex in networks. Existing network representation methods only focus on incomplete aspects of vertex information (i.e., vertex content, network structure or partial integration), moreover they are commonly designed for homogeneous information networks where all the vertices of a network are of the same type. In this paper, we propose a deep network representation model that integrates network structure and the vertex content information into a unified framework by exploiting generative adversarial network, and represents different types of vertices in the heterogeneous network in a continuous and common vector space. Based on the proposed model, we can obtain heterogeneous bibliographic network representation for efficient citation recommendation. The proposed model also makes personalized citation recommendation possible, which is a new issue that a few papers addressed in the past. When evaluated on the AAN and DBLP datasets, the performance of the proposed heterogeneous bibliographic network based citation recommendation approach is comparable with that of the other network representation based citation recommendation approaches. The results also demonstrate that the personalized citation recommendation approach is more effective than the non-personalized citation recommendation approach.

关键词

Heterogeneous Bibliographic Network; Network Representation; Personalized Citation Recommendation; Generative Adversarial Network

全文

PDF


A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

摘要

We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network. The network is initialized with embeddings that make use of character N-gram information to better suit this task. When evaluated on common benchmark test data sets (CoNLL-2014 and JFLEG), our model substantially outperforms all prior neural approaches on this task as well as strong statistical machine translation-based systems with neural and task-specific features trained on the same data. Our analysis shows the superiority of convolutional neural networks over recurrent neural networks such as long short-term memory (LSTM) networks in capturing the local context via attention, and thereby improving the coverage in correcting grammatical errors. By ensembling multiple models, and incorporating an N-gram language model and edit features via rescoring, our novel method becomes the first neural approach to outperform the current state-of-the-art statistical machine translation-based approach, both in terms of grammaticality and fluency.

关键词

grammatical error correction

全文

PDF


Weakly Supervised Induction of Affective Events by Optimizing Semantic Consistency

摘要

To understand narrative text, we must comprehend how people are affected by the events that they experience. For example, readers understand that graduating from college is a positive event (achievement) but being fired from one's job is a negative event (problem). NLP researchers have developed effective tools for recognizing explicit sentiments, but affective events are more difficult to recognize because the polarity is often implicit and can depend on both a predicate and its arguments. Our research investigates the prevalence of affective events in a personal story corpus, and introduces a weakly supervised method for large scale induction of affective events. We present an iterative learning framework that constructs a graph with nodes representing events and initializes their affective polarities with sentiment analysis tools as weak supervision. The events are then linked based on three types of semantic relations: (1) semantic similarity, (2) semantic opposition, and (3) shared components. The learning algorithm iteratively refines the polarity values by optimizing semantic consistency across all events in the graph. Our model learns over 100,000 affective events and identifies their polarities more accurately than other methods.

关键词

Natural Language Processing; Sentiment Analysis; Affective Events

全文

PDF


Cross-Lingual Propagation for Deep Sentiment Analysis

摘要

Across the globe, people are voicing their opinion in social media and various other online fora. Given such data, modern deep learning-based sentiment analysis methods excel at determining the sentiment polarity of what is being said about companies, products, etc. Unfortunately, such deep methods require significant training data, while for many languages, resources and training data are scarce. In this work, we present a cross-lingual propagation algorithm that yields sentiment embedding vectors for numerous languages. We then rely on a dual-channel convolutional neural architecture to incorporate them into the network. This allows us to achieve gains in deep sentiment analysis across a range of languages and domains.

关键词

全文

PDF


Reinforcement Learning for Relation Classification From Noisy Data

摘要

Existing relation classification methods that rely on distant supervision assume that a bag of sentences mentioning an entity pair are all describing a relation for the entity pair. Such methods, performing classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffers from the noisy labeling problem. In this paper, we propose a novel model for relation classification at the sentence level from noisy data. The model has two modules: an instance selector and a relation classifier. The instance selector chooses high-quality sentences with reinforcement learning and feeds the selected sentences into the relation classifier, and the relation classifier makes sentence-level prediction and provides rewards to the instance selector. The two modules are trained jointly to optimize the instance selection and relation classification processes.Experiment results show that our model can deal with the noise of data effectively and obtains better performance for relation classification at the sentence level.

关键词

Relation Classification; Reinforcement Learning; Data Denoise

全文

PDF


Twitter Summarization Based on Social Network and Sparse Reconstruction

摘要

With the rapid growth of microblogging services, such as Twitter, a vast of short and noisy messages are produced by millions of users, which makes people difficult to quickly grasp essential information of their interested topics. In this paper, we study extractive topic-oriented Twitter summarization as a solution to address this problem. Traditional summarization methods only consider text information, which is insufficient in social media situation. Existing Twitter summarization techniques rarely explore relations between tweets explicitly, ignoring that information can spread along the social network. Inspired by social theories that expression consistence and expression contagion are observed in social network, we propose a novel approach for Twitter summarization in short and noisy situation by integrating Social Network and Sparse Reconstruction (SNSR). We explore whether social relations can help Twitter summarization, modeling relations between tweets described as the social regularization and integrating it into the group sparse optimization framework. It conducts a sparse reconstruction process by selecting tweets that can best reconstruct the original tweets in a specific topic, with considering coverage and sparsity. We simultaneously design the diversity regularization to remove redundancy. In particular, we present a mathematical optimization formulation and develop an efficient algorithm to solve it. Due to the lack of public corpus, we construct the gold standard twitter summary datasets for 12 different topics. Experimental results on this datasets show the effectiveness of our framework for handling the large scale short and noisy messages in social media.

关键词

Twitter Summarization; Sparse Reconstruction; Social Network

全文

PDF


SEE: Syntax-Aware Entity Embedding for Neural Relation Extraction

摘要

Distant supervised relation extraction is an efficient approach to scale relation extraction to very large corpora, and has been widely used to find novel relational facts from plain text. Recent studies on neural relation extraction have shown great progress on this task via modeling the sentences in low-dimensional spaces, but seldom considered syntax information to model the entities. In this paper, we propose to learn syntax-aware entity embedding for neural relation extraction. First, we encode the context of entities on a dependency tree as sentence-level entity embedding based on tree-GRU. Then, we utilize both intra-sentence and inter-sentence attentions to obtain sentence set-level entity embedding over all sentences containing the focus entity pair. Finally, we combine both sentence embedding and entity embedding for relation classification. We conduct experiments on a widely used real-world dataset and the experimental results show that our model can make full use of all informative instances and achieve state-of-the-art performance of relation extraction.

关键词

全文

PDF


Semi-Distantly Supervised Neural Model for Generating Compact Answers to Open-Domain Why Questions

摘要

This paper proposes a neural network-based method for generating compact answers to open-domain why-questions (e.g., "Why was Mr. Trump elected as the president of the US?"). Unlike factoid question answering methods that provide short text spans as answers, existing work for why-question answering have aimed at answering questions by retrieving relatively long text passages, each of which often consists of several sentences, from a text archive. While the actual answer to a why-question may be expressed over several consecutive sentences, these often contain redundant and/or unrelated parts. Such answers would not be suitable for spoken dialog systems and smart speakers such as Amazon Echo, which receive much attention in these days. In this work, we aim at generating non-redundant compact answers to why-questions from answer passages retrieved from a very large web data corpora (4 billion web pages) by an already existing open-domain why-question answering system, using a novel neural network obtained by extending existing summarization methods. We also automatically generate training data using a large number of causal relations automatically extracted from 4 billion web pages by an existing supervised causality recognizer. The data is used to train our neural network, together with manually created training data. Through a series of experiments, we show that both our novel neural network and auto-generated training data improve the quality of the generated answers both in ROUGE score and in a subjective evaluation.

关键词

Summarization; Deep Learning; Neural Networks; Query; Question; Distant Supervision

全文

PDF


Task-Specific Representation Learning for Web-Scale Entity Disambiguation

摘要

Named entity disambiguation (NED) is a central problem in information extraction. The goal is to link entities in a knowledge graph (KG) to their mention spans in unstructured text. Each distinct mention span (like John Smith, Jordan or Apache) represents a multi-class classification task. NED can therefore be modeled as a multitask problem with tens of millions of tasks for realistic KGs. We initiate an investigation into neural representations, network architectures, and training protocols for multitask NED. Specifically, we propose a task-sensitive representation learning framework that learns mention dependent representations, followed by a common classifier. Parameter learning in our framework can be decomposed into solving multiple smaller problems involving overlapping groups of tasks. We prove bounds for excess risk, which provide additional insight into the problem of multi-task representation learning. While remaining practical in terms of training memory and time requirements, our approach outperforms recent strong baselines, on four benchmark data sets.

关键词

Information Extraction; Entity Disambiguation; Multitask Learning

全文

PDF


Byte-Level Machine Reading Across Morphologically Varied Languages

摘要

The machine reading task, where a computer reads a document and answers questions about it, is important in artificial intelligence research. Recently, many models have been proposed to address it. Word-level models, which have words as units of input and output, have proven to yield state-of-the-art results when evaluated on English datasets. However, in morphologically richer languages, many more unique words exist than in English due to highly productive prefix and suffix mechanisms. This may set back word-level models, since vocabulary sizes too big to allow for efficient computing may have to be employed. Multiple alternative input granularities have been proposed to avoid large input vocabularies, such as morphemes, character n-grams, and bytes. Bytes are advantageous as they provide a universal encoding format across languages, and allow for a small vocabulary size, which, moreover, is identical for every input language. In this work, we investigate whether bytes are suitable as input units across morphologically varied languages. To test this, we introduce two large-scale machine reading datasets in morphologically rich languages, Turkish and Russian. We implement 4 byte-level models, representing the major types of machine reading models and introduce a new seq2seq variant, called encoder-transformer-decoder. We show that, for all languages considered, there are models reading bytes outperforming the current state-of-the-art word-level baseline. Moreover, the newly introduced encoder-transformer-decoder performs best on the morphologically most involved dataset, Turkish. The large-scale Turkish and Russian machine reading datasets are released to public.

关键词

Machine Reading; Byte-level

全文

PDF


A Question-Focused Multi-Factor Attention Network for Question Answering

摘要

Neural network models recently proposed for question answering (QA) primarily focus on capturing the passage-question relation. However, they have minimal capability to link relevant facts distributed across multiple sentences which is crucial in achieving deeper understanding, such as performing multi-sentence reasoning, co-reference resolution, etc. They also do not explicitly focus on the question and answer type which often plays a critical role in QA. In this paper, we propose a novel end-to-end question-focused multi-factor attention network for answer extraction. Multi-factor attentive encoding using tensor-based transformation aggregates meaningful facts even when they are located in multiple sentences. To implicitly infer the answer type, we also propose a max-attentional question aggregation mechanism to encode a question vector based on the important words in a question. During prediction, we incorporate sequence-level encoding of the first wh-word and its immediately following word as an additional source of question type information. Our proposed model achieves significant improvements over the best prior state-of-the-art results on three large-scale challenging QA datasets, namely NewsQA, TriviaQA, and SearchQA.

关键词

question answering

全文

PDF


Training and Evaluating Improved Dependency-Based Word Embeddings

摘要

Word embedding has been widely used in many natural language processing tasks. In this paper, we focus on learning word embeddings through selective higher-order relationships in sentences to improve the embeddings to be less sensitive to local context and more accurate in capturing semantic compositionality. We present a novel multi-order dependency-based strategy to composite and represent the context under several essential constraints. In order to realize selective learning from the word contexts, we automatically assign the strengths of different dependencies between co-occurred words in the stochastic gradient descent process. We evaluate and analyze our proposed approach using several direct and indirect tasks for word embeddings. Experimental results demonstrate that our embeddings are competitive to or better than state-of-the-art methods and significantly outperform other methods in terms of context stability. The output weights and representations of dependencies obtained in our embedding model conform to most of the linguistic characteristics and are valuable for many downstream tasks.

关键词

Word embeddings; Dependency; Context composition

全文

PDF


Inference on Syntactic and Semantic Structures for Machine Comprehension

摘要

Hidden variable models are important tools for solving open domain machine comprehension tasks and have achieved remarkable accuracy in many question answering benchmark datasets. Existing models impose strong independence assumptions on hidden variables, which leaves the interaction among them unexplored. Here we introduce linguistic structures to help capturing global evidence in hidden variable modeling. In the proposed algorithms, question-answer pairs are scored based on structured inference results on parse trees and semantic frames, which aims to assign hidden variables in a global optimal way. Experiments on the MCTest dataset demonstrate that the proposed models are highly competitive with state-of-the-art machine comprehension systems.

关键词

全文

PDF


Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification

摘要

Cross-domain sentiment classification aims to leverage useful information in a source domain to help do sentiment classification in a target domain that has no or little supervised information. Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously. In order to solve this problem, we propose a Hierarchical Attention Transfer Network (HATN) for cross-domain sentiment classification. The proposed HATN provides a hierarchical attention transfer mechanism which can transfer attentions for emotions across domains by automatically capturing pivots and non-pivots. Besides, the hierarchy of the attention mechanism mirrors the hierarchical structure of documents, which can help locate the pivots and non-pivots better. The proposed HATN consists of two hierarchical attention networks, with one named P-net aiming to find the pivots and the other named NP-net aligning the non-pivots by using the pivots as a bridge. Specifically, P-net firstly conducts individual attention learning to provide positive and negative pivots for NP-net. Then, P-net and NP-net conduct joint attention learning such that the HATN can simultaneously capture pivots and non-pivots and realize transferring attentions for emotions across domains. Experiments on the Amazon review dataset demonstrate the effectiveness of HATN.

关键词

全文

PDF


Dynamic User Profiling for Streams of Short Texts

摘要

In this paper, we aim at tackling the problem of dynamic user profiling in the context of streams of short texts. Profiling users' expertise in such context is more challenging than in the case of long documents in static collection as it is difficult to track users' dynamic expertise in streaming sparse data. To obtain better profiling performance, we propose a streaming profiling algorithm (SPA). SPA first utilizes the proposed user expertise tracking topic model (UET) to track the changes of users' dynamic expertise and then utilizes the proposed streaming keyword diversification algorithm (SKDA) to produce top-k diversified keywords for profiling users' dynamic expertise at a specific point in time. Experimental results validate the effectiveness of the proposed algorithms.

关键词

Expert Profiling; Expertise Retrieval; Topic Models; Twitter; Short Texts

全文

PDF


Multi-Task Medical Concept Normalization Using Multi-View Convolutional Neural Network

摘要

Medical concept normalization is a critical problem in biomedical research and clinical applications. In this paper, we focus on normalizing diagnostic and procedure names in Chinese discharge summaries to standard entities, which is formulated as a semantic matching problem. However, non-standard Chinese expressions, short-text normalization and heterogeneity of tasks pose critical challenges in our problem. This paper presents a general framework which introduces a tensor generator and a novel multi-view convolutional neural network (CNN) with multi-task shared structure to tackle the two tasks simultaneously. We propose that the key to address non-standard expressions and short-text problem is to incorporate a matching tensor with multiple granularities. Then multi-view CNN is adopted to extract semantic matching patterns and learn to synthesize them from different views. Finally, multi-task shared structure allows the model to exploit medical correlations between disease and procedure names to better perform disambiguation tasks. Comprehensive experimental analysis indicates our model outperforms existing baselines which demonstrates the effectiveness of our model.

关键词

全文

PDF


Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM

摘要

Analyzing people’s opinions and sentiments towards certain aspects is an important task of natural language understanding. In this paper, we propose a novel solution to targeted aspect-based sentiment analysis, which tackles the challenges of both aspect-based sentiment analysis and targeted sentiment analysis by exploiting commonsense knowledge. We augment the long short-term memory (LSTM) network with a hierarchical attention mechanism consisting of a target-level attention and a sentence-level attention. Commonsense knowledge of sentiment-related concepts is incorporated into the end-to-end training of a deep neural network for sentiment classification. In order to tightly integrate the commonsense knowledge into the recurrent encoder, we propose an extension of LSTM, termed Sentic LSTM. We conduct experiments on two publicly released datasets, which show that the combination of the proposed attention architecture and Sentic LSTM can outperform state-of-the-art methods in targeted aspect sentiment tasks.

关键词

Sentiment; Long-short-term Memory; Commonsense

全文

PDF


Cognition-Cognizant Sentiment Analysis With Multitask Subjectivity Summarization Based on Annotators' Gaze Behavior

摘要

For document level sentiment analysis (SA), Subjectivity Extraction, ie., extracting the relevant subjective portions of the text that cover the overall sentiment expressed in the document, is an important step. Subjectivity Extraction, however, is a hard problem for systems, as it demands a great deal of world knowledge and reasoning. Humans, on the other hand, are good at extracting relevant subjective summaries from an opinionated document (say, a movie review), while inferring the sentiment expressed in it. This capability is manifested in their eye-movement behavior while reading: words pertaining to the subjective summary of the text attract a lot more attention in the form of gaze-fixations and/or saccadic patterns. We propose a multi-task deep neural framework for document level sentiment analysis that learns to predict the overall sentiment expressed in the given input document, by simultaneously learning to predict human gaze behavior and auxiliary linguistic tasks like part-of-speech and syntactic properties of words in the document. For this, a multi-task learning algorithm based on multi-layer shared LSTM augmented with task specific classifiers is proposed. With this composite multi-task network, we obtain performance competitive with or better than state-of-the-art approaches in SA. Moreover, the availability of gaze predictions as an auxiliary output helps interpret the system better; for instance, gaze predictions reveal that the system indeed performs subjectivity extraction better, which accounts for improvement in document level sentiment analysis performance.

关键词

Sentiment Analysis; Cognitive Natural Language Processing; Multitask Sentiment Analysis; Eye-tracking for Sentiment Analysis;Eye-tracking;Gaze Behavior;

全文

PDF


Argument Mining for Improving the Automated Scoring of Persuasive Essays

摘要

End-to-end argument mining has enabled the development of new automated essay scoring (AES) systems that use argumentative features (e.g., number of claims, number of support relations) in addition to traditional legacy features (e.g., grammar, discourse structure) when scoring persuasive essays. While prior research has proposed different argumentative features as well as empirically demonstrated their utility for AES, these studies have all had important limitations. In this paper we identify a set of desiderata for evaluating the use of argument mining for AES, introduce an end-to-end argument mining system and associated argumentative feature sets, and present the results of several studies that both satisfy the desiderata and demonstrate the value-added of argument mining for scoring persuasive essays.

关键词

argument mining, automated essay scoring

全文

PDF


Graph Convolutional Networks With Argument-Aware Pooling for Event Detection

摘要

The current neural network models for event detection have only considered the sequential representation of sentences. Syntactic representations have not been explored in this area although they provide an effective mechanism to directly link words to their informative context for event detection in the sentences. In this work, we investigate a convolutional neural network based on dependency trees to perform event detection. We propose a novel pooling method that relies on entity mentions to aggregate the convolution vectors. The extensive experiments demonstrate the benefits of the dependency-based convolutional neural networks and the entity mention-based pooling method for event detection. We achieve the state-of-the-art performance on widely used datasets with both perfect and predicted entity mentions.

关键词

Event Detection; Information Extraction; Deep Learning; Graph Convolutional Neural Networks

全文

PDF


Mention and Entity Description Co-Attention for Entity Disambiguation

摘要

For the task of entity disambiguation, mention contexts and entity descriptions both contain various kinds of information content while only a subset of them are helpful for disambiguation. In this paper, we propose a type-aware co-attention model for entity disambiguation, which tries to identify the most discriminative words from mention contexts and most relevant sentences from corresponding entity descriptions simultaneously. To bridge the semantic gap between mention contexts and entity descriptions, we further incorporate entity type information to enhance the co-attention mechanism. Our evaluation shows that the proposed model outperforms the state-of-the-arts on three public datasets. Further analysis also confirms that both the co-attention mechanism and the type-aware mechanism are effective.

关键词

Entity disambiguation, Neural Networks

全文

PDF


Jointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction

摘要

Event extraction plays an important role in natural language processing (NLP) applications including question answering and information retrieval. Traditional event extraction relies heavily on lexical and syntactic features, which require intensive human engineering and may not generalize to different datasets. Deep neural networks, on the other hand, are able to automatically learn underlying features, but existing networks do not make full use of syntactic relations. In this paper, we propose a novel dependency bridge recurrent neural network (dbRNN) for event extraction. We build our model upon a recurrent neural network, but enhance it with dependency bridges, which carry syntactically related information when modeling each word.We illustrates that simultaneously applying tree structure and sequence structure in RNN brings much better performance than only uses sequential RNN. In addition, we use a tensor layer to simultaneously capture the various types of latent interaction between candidate arguments as well as identify/classify all arguments of an event. Experiments show that our approach achieves competitive results compared with previous work.

关键词

event; extraction; dependency bridge; tensor; joint

全文

PDF


Content and Context: Two-Pronged Bootstrapped Learning for Regex-Formatted Entity Extraction

摘要

Regular expressions are an important building block of rule-based information extraction systems. Regexes can encode rules to recognize instances of simple entities which can then feed into the identification of more complex cross-entity relationships. Manually crafting a regex that recognizes all possible instances of an entity is difficult since an entity can manifest in a variety of different forms. Thus, the problem of automatically generalizing manually crafted seed regexes to improve the recall of IE systems has attracted research attention. In this paper, we propose a bootstrapped approach to improve the recall for extraction of regex-formatted entities, with the only source of supervision being the seed regex. Our approach starts from a manually authored high precision seed regex for the entity of interest, and uses the matches of the seed regex and the context around these matches to identify more instances of the entity. These are then used to identify a set of diverse, high recall regexes that are representative of this entity. Through an empirical evaluation over multiple real world document corpora, we illustrate the effectiveness of our approach.

关键词

Information Extraction, Entity Extraction, Rule-based Entity Extraction, Regular Expressions, Bootstrapping, Set Expansion

全文

PDF


Towards a Neural Conversation Model With Diversity Net Using Determinantal Point Processes

摘要

Typically, neural conversation systems generate replies based on the sequence-to-sequence (seq2seq) model. seq2seq tends to produce safe and universal replies, which suffers from the lack of diversity and information. Determinantal Point Processes (DPPs) is a probabilistic model defined on item sets, which can select the items with good diversity and quality. In this paper, we investigate the diversity issue in two different aspects, namely query-level and system-level diversity. We propose a novel framework which organically combines seq2seq model with Determinantal Point Processes (DPPs). The new framework achieves high quality in generated reply and significantly improves the diversity among them. Experiments show that our model achieves the best performance among various baselines in terms of both quality and diversity.

关键词

conversation systems; diversity; DPPs;

全文

PDF


S-Net: From Answer Extraction to Answer Synthesis for Machine Reading Comprehension

摘要

In this paper, we present a novel approach to machine reading comprehension for the MS-MARCO dataset. Unlike the SQuAD dataset that aims to answer a question with exact text spans in a passage, the MS-MARCO dataset defines the task as answering a question from multiple passages and the words in the answer are not necessary in the passages. We therefore develop an extraction-then-synthesis framework to synthesize answers from extraction results. Specifically, the answer extraction model is first employed to predict the most important sub-spans from the passage as evidence, and the answer synthesis model takes the evidence as additional features along with the question and passage to further elaborate the final answers. We build the answer extraction model with state-of-the-art neural networks for single passage reading comprehension, and propose an additional task of passage ranking to help answer extraction in multiple passages. The answer synthesis model is based on the sequence-to-sequence neural networks with extracted evidences as features. Experiments show that our extraction-then-synthesis method outperforms state-of-the-art methods.

关键词

全文

PDF


SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring

摘要

Deep learning has demonstrated tremendous potential for Automatic Text Scoring (ATS) tasks. In this paper, we describe a new neural architecture that enhances vanilla neural network models with auxiliary neural coherence features. Our new method proposes a new SkipFlow mechanism that models relationships between snapshots of the hidden representations of a long short-term memory (LSTM) network as it reads. Subsequently, the semantic relationships between multiple snapshots are used as auxiliary features for prediction. This has two main benefits. Firstly, essays are typically long sequences and therefore the memorization capability of the LSTM network may be insufficient. Implicit access to multiple snapshots can alleviate this problem by acting as a protection against vanishing gradients. The parameters of the SkipFlow mechanism also acts as an auxiliary memory. Secondly, modeling relationships between multiple positions allows our model to learn features that represent and approximate textual coherence. In our model, we call this neural coherence features. Overall, we present a unified deep learning architecture that generates neural coherence features as it reads in an end-to-end fashion. Our approach demonstrates state-of-the-art performance on the benchmark ASAP dataset, outperforming not only feature engineering baselines but also other deep learning models.

关键词

Essay Grading; Educational AI; LSTM; Deep Learning; ASAP Dataset; Kaggle; Essay

全文

PDF


Learning to Attend via Word-Aspect Associative Fusion for Aspect-Based Sentiment Analysis

摘要

Aspect-based sentiment analysis (ABSA) tries to predict the polarity of a given document with respect to a given aspect entity. While neural network architectures have been successful in predicting the overall polarity of sentences, aspect-specific sentiment analysis still remains as an open problem. In this paper, we propose a novel method for integrating aspect information into the neural model. More specifically, we incorporate aspect information into the neural model by modeling word-aspect relationships. Our novel model, Aspect Fusion LSTM (AF-LSTM) learns to attend based on associative relationships between sentence words and aspect which allows our model to adaptively focus on the correct words given an aspect term. This ameliorates the flaws of other state-of-the-art models that utilize naive concatenations to model word-aspect similarity. Instead, our model adopts circular convolution and circular correlation to model the similarity between aspect and words and elegantly incorporates this within a differentiable neural attention framework. Finally, our model is end-to-end differentiable and highly related to convolution-correlation (holographic like) memories. Our proposed neural model achieves state-of-the-art performance on benchmark datasets, outperforming ATAE-LSTM by 4%-5% on average across multiple datasets.

关键词

Aspect-based; Aspect-level; Neural Networks; Attention Mechanism; Learning to Attend; Sentiment Analysis; Aspect-based Sentiment Analysis; LSTM; Recurrent Neural Network

全文

PDF


Investigating Inner Properties of Multimodal Representation and Semantic Compositionality With Brain-Based Componential Semantics

摘要

Multimodal models have been proven to outperform text-based approaches on learning semantic representations. However, it still remains unclear what properties are encoded in multimodal representations, in what aspects do they outperform the single-modality representations, and what happened in the process of semantic compositionality in different input modalities. Considering that multimodal models are originally motivated by human concept representations, we assume that correlating multimodal representations with brain-based semantics would interpret their inner properties to answer the above questions. To that end, we propose simple interpretation methods based on brain-based componential semantics. First we investigate the inner properties of multimodal representations by correlating them with corresponding brain-based property vectors. Then we map the distributed vector space to the interpretable brain-based componential space to explore the inner properties of semantic compositionality. Ultimately, the present paper sheds light on the fundamental questions of natural language understanding, such as how to represent the meaning of words and how to combine word meanings into larger units.

关键词

Multimodal model; brain-based componential semantics; word representation; image representation

全文

PDF


Learning Multimodal Word Representation via Dynamic Fusion Methods

摘要

Multimodal models have been proven to outperform text-based models on learning semantic word representations. Almost all previous multimodal models typically treat the representations from different modalities equally. However, it is obvious that information from different modalities contributes differently to the meaning of words. This motivates us to build a multimodal model that can dynamically fuse the semantic representations from different modalities according to different types of words. To that end, we propose three novel dynamic fusion methods to assign importance weights to each modality, in which weights are learned under the weak supervision of word association pairs. The extensive experiments have demonstrated that the proposed methods outperform strong unimodal baselines and state-of-the-art multimodal models.

关键词

Multimodal word representations; dynamic fusion; word associations

全文

PDF


R3: Reinforced Ranker-Reader for Open-Domain Question Answering

摘要

In recent years researchers have achieved considerable success applying neural network methods to question answering (QA). These approaches have achieved state of the art results in simplified closed-domain settings such as the SQuAD (Rajpurkar et al. 2016) dataset, which provides a pre-selected passage, from which the answer to a given question may be extracted. More recently, researchers have begun to tackle open-domain QA, in which the model is given a question and access to a large corpus (e.g., wikipedia) instead of a pre-selected passage (Chen et al. 2017a). This setting is more complex as it requires large-scale search for relevant passages by an information retrieval component, combined with a reading comprehension model that “reads” the passages to generate an answer to the question. Performance in this setting lags well behind closed-domain performance. In this paper, we present a novel open-domain QA system called Reinforced Ranker-Reader (R3), based on two algorithmic innovations. First, we propose a new pipeline for open-domain QA with a Ranker component, which learns to rank retrieved passages in terms of likelihood of extracting the ground-truth answer to a given question. Second, we propose a novel method that jointly trains the Ranker along with an answer-extraction Reader model, based on reinforcement learning. We report extensive experimental results showing that our method significantly improves on the state of the art for multiple open-domain QA datasets.

关键词

Question Answering;Reinforcement learning;Deep Learning;QA;RL;DL

全文

PDF


Improving Review Representations With User Attention and Product Attention for Sentiment Classification

摘要

Neural network methods have achieved great success in reviews sentiment classification. Recently, some works achieved improvement by incorporating user and product information to generate a review representation. However, in reviews, we observe that some words or sentences show strong user's preference, and some others tend to indicate product's characteristic. The two kinds of information play different roles in determining the sentiment label of a review. Therefore, it is not reasonable to encode user and product information together into one representation. In this paper, we propose a novel framework to encode user and product information. Firstly, we apply two individual hierarchical neural networks to generate two representations, with user attention or with product attention. Then, we design a combined strategy to make full use of the two representations for training and final prediction. The experimental results show that our model obviously outperforms other state-of-the-art methods on IMDB and Yelp datasets. Through the visualization of attention over words related to user or product, we validate our observation mentioned above.

关键词

Neural network; Attention; Sentiment classification

全文

PDF


Improving Neural Fine-Grained Entity Typing With Knowledge Attention

摘要

Fine-grained entity typing aims to identify the semantic type of an entity in a particular plain text. It is an important task which can be helpful for a lot of natural language processing (NLP) applications. Most existing methods typically extract features separately from the entity mention and context words for type classification. These methods inevitably fail to model complex correlations between entity mentions and context words. They also neglect rich background information about these entities in knowledge bases (KBs). To address these issues, we take information from KBs into consideration to bridge entity mentions and their context together, and thereby propose Knowledge-Attention Neural Fine-Grained Entity Typing. Experimental results and case studies on real-world datasets demonstrate that our model significantly outperforms other state-of-the-art methods, revealing the effectiveness of incorporating KB information for entity typing. Code and data for this paper can be found at https://github.com/thunlp/KNET.

关键词

Entity Typing; Attention; Knowledge Base

全文

PDF


Diagnosing and Improving Topic Models by Analyzing Posterior Variability

摘要

Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters, which has primarily been used to increase the robustness of parameter estimates. In this work, we explore other rich information that can be obtained by analyzing the posterior distributions in topic models. Experimenting with latent Dirichlet allocation on two datasets, we propose ideas incorporating information about the posterior distributions at the topic level and at the word level. At the topic level, we propose a metric called topic stability that measures the variability of the topic parameters under the posterior. We show that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models. At the word level, we experiment with different methods for adjusting individual word probabilities within topics based on their uncertainty. Humans prefer words ranked by our adjusted estimates nearly twice as often when compared to the traditional approach. Finally, we describe how the ideas presented in this work could potentially applied to other predictive or exploratory models in future work.

关键词

全文

PDF


Dual Attention Network for Product Compatibility and Function Satisfiability Analysis

摘要

Product compatibility and functionality are of utmost importance to customers when they purchase products, and to sellers and manufacturers when they sell products. Due to the huge number of products available online, it is infeasible to enumerate and test the compatibility and functionality of every product. In this paper, we address two closely related problems: product compatibility analysis and function satisfiability analysis, where the second problem is a generalization of the first problem (e.g., whether a product works with another product can be considered as a special function). We first identify a novel question and answering corpus that is up-to-date regarding product compatibility and functionality information. To allow automatic discovery product compatibility and functionality, we then propose a deep learning model called Dual Attention Network (DAN). Given a QA pair for a to-be-purchased product, DAN learns to 1) discover complementary products (or functions), and 2) accurately predict the actual compatibility (or satisfiability) of the discovered products (or functions). The challenges addressed by the model include the briefness of QAs, linguistic patterns indicating compatibility, and the appropriate fusion of questions and answers. We conduct experiments to quantitatively and qualitatively show that the identified products and functions have both high coverage and accuracy, compared with a wide spectrum of baselines.

关键词

deep learning; question and answering; sentiment analysis; compatibility analysis; function satisfiability analysis

全文

PDF


Assertion-Based QA With Question-Aware Open Information Extraction

摘要

We present assertion based question answering (ABQA), an open domain question answering task that takes a question and a passage as inputs, and outputs a semi-structured assertion consisting of a subject, a predicate and a list of arguments. An assertion conveys more evidences than a short answer span in reading comprehension, and it is more concise than a tedious passage in passage-based QA. These advantages make ABQA more suitable for human-computer interaction scenarios such as voice-controlled speakers. Further progress towards improving ABQA requires richer supervised dataset and powerful models of text understanding. To remedy this, we introduce a new dataset called WebAssertions, which includes hand-annotated QA labels for 358,427 assertions in 55,960 web passages. To address ABQA, we develop both generative and extractive approaches. The backbone of our generative approach is sequence to sequence learning. In order to capture the structure of the output assertion, we introduce a hierarchical decoder that first generates the structure of the assertion and then generates the words of each field. The extractive approach is based on learning to rank. Features at different levels of granularity are designed to measure the semantic relevance between a question and an assertion. Experimental results show that our approaches have the ability to infer question-aware assertions from a passage. We further evaluate our approaches by incorporating the ABQA results as additional features in passage-based QA. Results on two datasets show that ABQA features significantly improve the accuracy on passage-based QA.

关键词

Information Extraction; Question Answering

全文

PDF


Multi-Entity Aspect-Based Sentiment Analysis With Context, Entity and Aspect Memory

摘要

Inspired by recent works in Aspect-Based Sentiment Analysis (ABSA) on product reviews and faced with more complex posts on social media platforms mentioning multiple entities as well as multiple aspects, we define a novel task called Multi-Entity Aspect-Based Sentiment Analysis (ME-ABSA). This task aims at fine-grained sentiment analysis of (entity, aspect) combinations, making the well-studied ABSA task a special case of it. To address the task, we propose an innovative method that models Context memory, Entity memory and Aspect memory, called CEA method. Our experimental results show that our CEA method achieves a significant gain over several baselines, including the state-of-the-art method for the ABSA task, and their enhanced versions, on datasets for ME-ABSA and ABSA tasks. The in-depth analysis illustrates the significant advantage of the CEA method over baseline methods for several hard-to-predict post types. Furthermore, we show that the CEA method is capable of generalizing to new (entity, aspect) combinations with little loss of accuracy. This observation indicates that data annotation in real applications can be largely simplified.

关键词

Multi-Entity Aspect-Based Sentiment Analysis; Opinion Mining

全文

PDF


OTyper: A Neural Architecture for Open Named Entity Typing

摘要

Named Entity Typing (NET) is valuable for many natural language processing tasks, such as relation extraction, question answering, knowledge base population, and co-reference resolution. Classical NET targeted a few coarse-grained types, but the task has expanded to sets of hundreds of types in recent years. Existing work in NET assumes that the target types are specified in advance, and that hand-labeled examples of each type are available. In this work, we introduce the task of Open Named Entity Typing (ONET), which is NET when the set of target types is not known in advance. We propose a neural network architecture for ONET, called OTyper, and evaluate its ability to tag entities with types not seen in training. On the benchmark FIGER(GOLD) dataset, OTyper achieves a weighted AUC-ROC score of 0.870 on unseen types, substantially outperforming pattern- and embedding-based baselines.

关键词

Named Entity Typing; Open domain; Zero-shot learning

全文

PDF


Scale Up Event Extraction Learning via Automatic Training Data Generation

摘要

The task of event extraction has long been investigated in a supervised learning paradigm, which is bound by the number and the quality of the training instances. Existing training data must be manually generated through a combination of expert domain knowledge and extensive human involvement. However, due to drastic efforts required in annotating text, the resultant datasets are usually small, which severally affects the quality of the learned model, making it hard to generalize. Our work develops an automatic approach for generating training data for event extraction. Our approach allows us to scale up event extraction training instances from thousands to hundreds of thousands, and it does this at a much lower cost than a manual approach. We achieve this by employing distant supervision to automatically create event annotations from unlabelled text using existing structured knowledge bases or tables.We then develop a neural network model with post inference to transfer the knowledge extracted from structured knowledge bases to automatically annotate typed events with corresponding arguments in text.We evaluate our approach by using the knowledge extracted from Freebase to label texts from Wikipedia articles. Experimental results show that our approach can generate a large number of highquality training instances. We show that this large volume of training data not only leads to a better event extractor, but also allows us to detect multiple typed events.

关键词

全文

PDF


Learning Structured Representation for Text Classification via Reinforcement Learning

摘要

Representation learning is a fundamental problem in natural language processing. This paper studies how to learn a structured representation for text classification. Unlike most existing representation models that either use no structure or rely on pre-specified structures, we propose a reinforcement learning (RL) method to learn sentence representation by discovering optimized structures automatically. We demonstrate two attempts to build structured representation: Information Distilled LSTM (ID-LSTM) and Hierarchically Structured LSTM (HS-LSTM). ID-LSTM selects only important, task-relevant words, and HS-LSTM discovers phrase structures in a sentence. Structure discovery in the two representation models is formulated as a sequential decision problem: current decision of structure discovery affects following decisions, which can be addressed by policy gradient RL. Results show that our method can learn task-friendly representations by identifying important words or task-relevant structures without explicit structure annotations, and thus yields competitive performance.

关键词

Reinforcement Learning; Structure; Text Classification

全文

PDF


Duplicate Question Identification by Integrating FrameNet With Neural Networks

摘要

There are two major problems in duplicate question identification, namely lexical gap and essential constituents matching. Previous methods either design various similarity features or learn representations via neural networks, which try to solve the lexical gap but neglect the essential constituents matching. In this paper, we focus on the essential constituents matching problem and use FrameNet-style semantic parsing to tackle it. Two approaches are proposed to integrate FrameNet parsing with neural networks. An ensemble approach combines a traditional model with manually designed features and a neural network model. An embedding approach converts frame parses to embeddings, which are combined with word embeddings at the input of neural networks. Experiments on Quora question pairs dataset demonstrate that the ensemble approach is more effective and outperforms all baselines.

关键词

Duplicate Question Identification; FrameNet

全文

PDF


Variational Reasoning for Question Answering With Knowledge Graph

摘要

Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts. However, it is challenging to build QA systems which can learn to reason over knowledge graphs based on question-answer pairs alone. First, when people ask questions, their expressions are noisy (for example, typos in texts, or variations in pronunciations), which is non-trivial for the QA system to match those mentioned entities to the knowledge graph. Second, many questions require multi-hop logic reasoning over the knowledge graph to retrieve the answers. To address these challenges, we propose a novel and unified deep learning architecture, and an end-to-end variational learning algorithm which can handle noise in questions, and learn multi-hop reasoning simultaneously. Our method achieves state-of-the-art performance on a recent benchmark dataset in the literature. We also derive a series of new benchmark datasets, including questions for multi-hop reasoning, questions paraphrased by neural translation model, and questions in human voice. Our method yields very promising results on all these challenging datasets.

关键词

Question Answering

全文

PDF


Hierarchical Attention Flow for Multiple-Choice Reading Comprehension

摘要

In this paper, we focus on multiple-choice reading comprehension which aims to answer a question given a passage and multiple candidate options. We present the hierarchical attention flow to adequately leverage candidate options to model the interactions among passages, questions and candidate options. We observe that leveraging candidate options to boost evidence gathering from the passages play a vital role in this task, which is ignored in previous works. In addition, we explicitly model the option correlations with attention mechanism to obtain better option representations, which are further fed into a bilinear layer to obtain the ranking score for each option. On a large-scale multiple-choice reading comprehension dataset (i.e. the RACE dataset), the proposed model outperforms two previous neural network baselines on both RACE-M and RACE-H subsets and yields the state-of-the-art overall results.

关键词

reading comprehension; deep neural network;

全文

PDF


Resource-Constrained Scheduling for Maritime Traffic Management

摘要

We address the problem of mitigating congestion and preventing hotspots in busy water areas such as Singapore Straits and port waters. Increasing maritime traffic coupled with narrow waterways makes vessel schedule coordination for just-in-time arrival critical for navigational safety. Our contributions are: 1) We formulate the maritime traffic management problem based on the real case study of Singapore waters; 2) We model the problem as a variant of the resource-constrained project scheduling problem (RCPSP), and formulate mixed-integer and constraint programming (MIP/CP) formulations; 3) To improve the scalability, we develop a combinatorial Benders (CB) approach that is significantly more effective than standard MIP and CP formulations. We also develop symmetry breaking constraints and optimality cuts that further enhance the CB approach's effectiveness; 4) We develop a realistic maritime traffic simulator using electronic navigation charts of Singapore Straits. Our scheduling approach on synthetic problems and a real 55-day AIS dataset results in significant reduction of the traffic density while incurring minimal delays.

关键词

Benders Decomposition

全文

PDF


Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary

摘要

Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose LatPlan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), and a pair of images representing the initial and the goal states (planning inputs), LatPlan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. The contribution of this paper is twofold: (1) State Autoencoder, which finds a propositional state representation of the environment using a Variational Autoencoder. It generates a discrete latent vector from the images, based on which a PDDL model can be constructed and then solved by an off-the-shelf planner. (2) Action Autoencoder / Discriminator, a neural architecture which jointly finds the action symbols and the implicit action models (preconditions/effects), and provides a successor function for the implicit graph search. We evaluate LatPlan using image-based versions of 3 planning domains: 8-puzzle, Towers of Hanoi and LightsOut.

关键词

Classical Planning; Deep Learning; Symbol Grounding

全文

PDF


Planning With Pixels in (Almost) Real Time

摘要

Recently, width-based planning methods have been shown to yield state-of-the-art results in the Atari 2600 video games. For this, the states were associated with the (RAM) memory states of the simulator. In this work, we consider the same planning problem but using the screen instead. By using the same visual inputs, the planning results can be compared with those of humans and learning methods. We show that the planning approach, out of the box and without training, results in scores that compare well with those obtained by humans and learning methods, and moreover, by developing an episodic, rollout version of the IW(k) algorithm, we show that such scores can be obtained in almost real time.

关键词

Planning; Width-based Search; Rollouts; Atari; Games

全文

PDF


totSAT - Totally-Ordered Hierarchical Planning Through SAT

摘要

In this paper, we propose a novel SAT-based planning approach for hierarchical planning by introducing the SAT-based planner totSAT for the class of totally-ordered HTN planning problems. We use the same general approach as SAT planning for classical planning does: bound the problem, translate the problem into a formula, and if the formula is not satisfiable, increase the bound. In HTN planning, a suitable bound is the maximum depth of decomposition. We show how totally-ordered HTN planning problems can be translated into a SAT formula, given this bound. Furthermore, we have conducted an extensive empirical evaluation to compare our new planner against state-of-the-art HTN planners. It shows that our technique outperforms any of these systems.

关键词

HTN Planning; SAT-based Planning; Totally-Ordered HTN Planning

全文

PDF


Sublinear Search Spaces for Shortest Path Planning in Grid and Road Networks

摘要

Shortest path planning is a fundamental building block in many applications. Hence developing efficient methods for computing shortest paths in e.g. road or grid networks is an important challenge. The most successful techniques for fast query answering rely on preprocessing. But for many of these techniques it is not fully understood why they perform so remarkably well and theoretical justification for the empirical results is missing. An attempt to explain the excellent practical performance of preprocessing based techniques on road networks (as transit nodes, hub labels, or contraction hierarchies) in a sound theoretical way are parametrized analyses, e.g., considering the highway dimension or skeleton dimension of a graph. But these parameters tend to be large (order of Θ(√n)) when the network contains grid-like substructures — which inarguably is the case for real-world road networks around the globe. In this paper, we use the very intuitive notion of bounded growth graphs to describe road networks and also grid graphs. We show that this model suffices to prove sublinear search spaces for the three above mentioned state-of-the-art shortest path planning techniques. For graphs with a large highway or skeleton dimension, our results turn out to be superior. Furthermore, our preprocessing methods are close to the ones used in practice and only require randomized polynomial time.

关键词

bounded growth; contraction hierarchies; transit nodes; hub labels

全文

PDF


Scheduling in Visual Fog Computing: NP-Completeness and Practical Efficient Solutions

摘要

The visual fog paradigm envisions tens of thousands of heterogeneous, camera-enabled edge devices distributed across the Internet, providing live sensing for a myriad of different visual processing applications. The scale, computational demands, and bandwidth needed for visual computing pipelines necessitates offloading intelligently to distributed computing infrastructure, including the cloud, Internet gateway devices, and the edge devices themselves. This paper focuses on the visual fog scheduling problem of assigning the visual computing tasks to various devices to optimize network utilization. We first prove this problem is NP-complete, and then formulate a practical, efficient solution. We demonstrate sub-minute computation time to optimally schedule 20,000 tasks across over 7,000 devices, and just 7-minute execution time to place 60,000 tasks across 20,000 devices, showing our approach is ready to meet the scale challenges introduced by visual fog.

关键词

scheduling; visual fog; artificial intelligence

全文

PDF


Fat- and Heavy-Tailed Behavior in Satisficing Planning

摘要

In this work, we study the runtime distribution of satisficing planning in ensembles of random planning problems and in multiple runs of a randomized heuristic search on a single planning instance. Using common heuristic functions (such as FF) and six benchmark problem domains from the IPC, we find a heavy-tailed behavior, similar to that found in CSP and SAT. We investigate two notions of constrainedness, often used in the modeling of planning problems, and show that the heavy-tailed behavior tends to appear in relatively relaxed problems, where the required effort is, on average, low. Finally, we show that as with randomized restarts in CSP and SAT solving, recent search enhancements that incorporate randomness in the search process can help mitigate the effect of the heavy tail.

关键词

Satisficing Planning; GBFS; Heavy-tailed

全文

PDF


Finite Sample Analyses for TD(0) With Function Approximation

摘要

TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modified versions, e.g., projected variants or ones where stepsizes depend on unknown problem parameters. Our analyses obviate these artificial alterations by exploiting strong properties of TD(0). We provide convergence rates both in expectation and with high-probability. The two are obtained via different approaches that use relatively unknown, recently developed stochastic approximation techniques.

关键词

Reinforcement Learning; TD Learning; Stochastic Approximation; Online Learning

全文

PDF


Synthesis of Orchestrations of Transducers for Manufacturing

摘要

In this paper, we model manufacturing processes and facilities as transducers (automata with output). The problem of whether a given manufacturing process can be realized by a given set of manufacturing resources can then be stated as an orchestration problem for transducers. We first consider the conceptually simpler case of uni-transducers (transducers with a single input and a single output port), and show that synthesizing orchestrations for uni-transducers is EXPTIME-complete. Surprisingly, the complexity remains the same for the more expressive multi-transducer case, where transducers have multiple input and output ports and the orchestration is in charge of dynamically connecting ports during execution.

关键词

全文

PDF


Meta-Search Through the Space of Representations and Heuristics on a Problem by Problem Basis

摘要

Two key aspects of problem solving are representation and search heuristics. Both theoretical and experimental studies have shown that there is no one best problem representation nor one best search heuristic. Therefore, some recent methods, e.g., portfolios, learn a good combination of problem solvers to be used in a given domain or set of domains. There are even dynamic portfolios that select a particular combination of problem solvers specific to a problem. These approaches: (1) need to perform a learning step; (2) do not usually focus on changing the representation of the input domain/problem; and (3) frequently do not adapt the portfolio to the specific problem. This paper describes a meta-reasoning system that searches through the space of combinations of representations and heuristics to find one suitable for optimally solving the specific problem. We show that this approach can be better than selecting a combination to use for all problems within a domain and is competitive with state of the art optimal planners.

关键词

online meta-search; representation change

全文

PDF


Expressive Real-Time Intersection Scheduling

摘要

We present Expressive Real-time Intersection Scheduling (ERIS), a schedule-driven control strategy for adaptive intersection control to reduce traffic congestion. ERIS maintains separate estimates for each lane approaching a traffic intersection allowing it to more accurately estimate the effects of scheduling decisions than previous schedule-driven approaches. We present a detailed description of the search space and A* search heuristic employed by ERIS to make scheduling decisions in real-time (every second). As a result of its increased expressiveness, ERIS outperforms a less expressive schedule-driven approach and a fully-actuated control method in a variety of simulated traffic environments.

关键词

A* Search; Heuristic Search; Traffic Scheduling

全文

PDF


Planning and Learning for Decentralized MDPs With Event Driven Rewards

摘要

Decentralized (PO)MDPs provide a rigorous framework for sequential multiagent decision making under uncertainty. However, their high computational complexity limits the practical impact. To address scalability and real-world impact, we focus on settings where a large number of agents primarily interact through complex joint-rewards that depend on their entire histories of states and actions. Such history-based rewards encapsulate the notion of events or tasks such that the team reward is given only when the joint-task is completed. Algorithmically, we contribute---1) A nonlinear programming (NLP) formulation for such event-based planning model; 2) A probabilistic inference based approach that scales much better than NLP solvers for a large number of agents; 3) A policy gradient based multiagent reinforcement learning approach that scales well even for exponential state-spaces. Our inference and RL-based advances enable us to solve a large real-world multiagent coverage problem modeling schedule coordination of agents in a real urban subway network where other approaches fail to scale.

关键词

Multiagent Planning; Coordination and Collaboration; Probabilistic Planning; Sequential Decision Making

全文

PDF


A Recursive Algorithm to Generate Balanced Weekend Tournaments

摘要

In this paper, we construct a Balanced Weekend Tournament, motivated by the real-life problem of scheduling an n-team double round-robin season schedule for a Canadian university soccer league. In this 6-team league, games are only played on Saturdays and Sundays, with the condition that no team has two road games on any weekend. The implemented regular-season schedule for n = 6 was best-possible, but failed to meet an important "compactness" criterion, as the 10-game tournament required more than five weekends to complete. The motivation for this paper was to determine whether an optimal season schedule, satisfying all of the league's constraints on compact balanced play, could be constructed for sports leagues with n > 6 teams. We present a simple recursive algorithm to answer this question for all even n > 6. As a corollary, our construction gives us an explicit solution to a challenging and well-known graph theory question, namely the problem of decomposing the complete directed graph K*2m into 2m–1 directed Hamiltonian cycles of length 2m.

关键词

sports scheduling; tournament scheduling; hamiltonian cycles; hamiltonian decompositions

全文

PDF


Plan Recognition in Continuous Domains

摘要

Plan recognition is the task of inferring the plan of an agent, based on an incomplete sequence of its observed actions. Previous formulations of plan recognition commit early to discretizations of the environment and the observed agent's actions. This leads to reduced recognition accuracy. To address this, we first provide a formalization of recognition problems which admits continuous environments, as well as discrete domains. We then show that through mirroring---generalizing plan-recognition by planning---we can apply continuous-world motion planners in plan recognition. We provide formal arguments for the usefulness of mirroring, and empirically evaluate mirroring in more than a thousand recognition problems in three continuous domains and six classical planning domains.

关键词

plan recognition; goal recognition; motion planning; mirroring; mirror neuron system modeling

全文

PDF


Semi-Black Box: Rapid Development of Planning Based Solutions

摘要

Software developers nowadays not infrequently face a challenge of solving problems that essentially sum up to finding a sequence of deterministic actions leading from a given initial state to a goal. This is the problem of deterministic planning, one of the most basic and well studied problems in artificial intelligence. Two of the best known approaches to deterministic planning are the black box approach, in which a programmer implements a successor generator, and the model-based approach, in which a user describes the problem symbolically, e.g., in PDDL. While the black box approach is usually easier for programmers who are not experts in AI to understand, it does not scale up without informative heuristics. We propose an approach that we baptize as semi-black box (SBB) that combines the strength of both. SBB is implemented as a set of Java classes, which a programmer can inherit from when implementing a successor generator. Using the known characteristics of these classes, we then automatically derive heuristics for the problem. Our empirical evaluation shows that these heuristics allow the planner to scale up significantly better than the traditional black box approach.

关键词

Automated Planning; Modeling

全文

PDF


Multiagent Simple Temporal Problem: The Arc-Consistency Approach

摘要

The Simple Temporal Problem (STP) is a fundamental temporal reasoning problem and has recently been extended to the Multiagent Simple Temporal Problem (MaSTP). In this paper we present a novel approach that is based on enforcing arc-consistency (AC) on the input (multiagent) simple temporal network. We show that the AC-based approach is sufficient for solving both the STP and MaSTP and provide efficient algorithms for them. As our AC-based approach does not impose new constraints between agents, it does not violate the privacy of the agents and is superior to the state-of-the-art approach to MaSTP. Empirical evaluations on diverse benchmark datasets also show that our AC-based algorithms for STP and MaSTP are significantly more efficient than existing approaches.

关键词

Simple Temporal Network; Multiagent Simple Temporal Network; Arc-Consistency

全文

PDF


Load Scheduling of Simple Temporal Networks Under Dynamic Resource Pricing

摘要

We study load scheduling of simple temporal networks (STNs) under dynamic pricing of resources. We are given a set of processes and a set of simple temporal constraints between their execution times, i.e., an STN. Each process uses a certain amount of resource for execution. The unit price of the resource is a function of time, f(t). The goal is to find a schedule of a given STN that trades off makespan minimization against cost minimization within a user-specified suboptimality bound. We provide a polynomial-time algorithm for solving the load scheduling problem when f(t) is piecewise constant. This has important applications in many real-world domains including the smart home and smart grid domains. We then study the dependency of the unit price of the resource on time as well as the total demand at that time. This leads to a further characterization of tractable, NP-hard, and conjectured tractable cases.

关键词

Load scheduling; simple temporal networks; dynamic resource pricing; smart home

全文

PDF


On the Relationship Between State-Dependent Action Costs and Conditional Effects in Planning

摘要

When planning for tasks that feature both state-dependent action costs and conditional effects using relaxation heuristics, the following problem appears: handling costs and effects separately leads to worse-than-necessary heuristic values, since we may get the more useful effect at the lower cost by choosing different values of a relaxed variable when determining relaxed costs and relaxed active effects. In this paper, we show how this issue can be avoided by representing state-dependent costs and conditional effects uniformly, both as edge-valued multi-valued decision diagrams (EVMDDs) over different sets of edge values, and then working with their product diagram. We develop a theory of EVMDDs that is general enough to encompass state-dependent action costs, conditional effects, and even their combination.We define relaxed effect semantics in the presence of state-dependent action costs and conditional effects, and describe how this semantics can be efficiently computed using product EVMDDs. This will form the foundation for informative relaxation heuristics in the setting with state-dependent costs and conditional effects combined.

关键词

Artificial Intelligence, Planning

全文

PDF


Generalized Value Iteration Networks:Life Beyond Lattices

摘要

In this paper, we introduce a generalized value iteration network (GVIN), which is an end-to-end neural network planning module. GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. We propose three novel differentiable kernels as graph convolution operators and show that the embedding-based kernel achieves the best performance. Furthermore, we present episodic Q-learning, an improvement upon traditional n-step Q-learning that stabilizes training for VIN and GVIN. Lastly, we evaluate GVIN on planning problems in 2D mazes, irregular graphs, and real-world street networks, showing that GVIN generalizes well for both arbitrary graphs and unseen graphs of larger scaleand outperforms a naive generalization of VIN (discretizing a spatial graph into a 2D image).

关键词

value iteration network; irregular graph; graph convolution

全文

PDF


Linear and Integer Programming-Based Heuristics for Cost-Optimal Numeric Planning

摘要

Linear programming has been successfully used to compute admissible heuristics for cost-optimal classical planning. Although one of the strengths of linear programming is the ability to express and reason about numeric variables and constraints, their use in numeric planning is limited. In this work, we extend linear programming-based heuristics for classical planning to support numeric state variables. In particular, we propose a model for the interval relaxation, coupled with landmarks and state equation constraints. We consider both linear programming models and their harder-to-solve, yet more informative, integer programming versions. Our experimental analysis shows that considering an NP-Hard heuristic often pays off and that A* search using our integer programming heuristics establishes a new state of the art in cost-optimal numeric planning.

关键词

Cost-optimal Numeric Planning; Heuristic Search; Numeric Planning; Integer Programming; Linear Programming

全文

PDF


Sensor-Based Activity Recognition via Learning From Distributions

摘要

Sensor-based activity recognition aims to predict users' activities from multi-dimensional streams of various sensor readings received from ubiquitous sensors. To use machine learning techniques for sensor-based activity recognition, previous approaches focused on composing a feature vector to represent sensor-reading streams received within a period of various lengths. With the constructed feature vectors, e.g., using predefined orders of moments in statistics, and their corresponding labels of activities, standard classification algorithms can be applied to train a predictive model, which will be used to make predictions online. However, we argue that in this way some important information, e.g., statistical information captured by higher-order moments, may be discarded when constructing features. Therefore, in this paper, we propose a new method, denoted by SMMAR, based on learning from distributions for sensor-based activity recognition. Specifically, we consider sensor readings received within a period as a sample, which can be represented by a feature vector of infinite dimensions in a Reproducing Kernel Hilbert Space (RKHS) using kernel embedding techniques. We then train a classifier in the RKHS. To scale-up the proposed method, we further offer an accelerated version by utilizing an explicit feature map instead of using a kernel function. We conduct experiments on four benchmark datasets to verify the effectiveness and scalability of our proposed method.

关键词

全文

PDF


Knowledge-Based Policies for Qualitative Decentralized POMDPs

摘要

Qualitative Decentralized Partially Observable Markov Decision Problems (QDec-POMDPs) constitute a very general class of decision problems. They involve multiple agents, decentralized execution, sequential decision, partial observability, and uncertainty. Typically, joint policies, which prescribe to each agent an action to take depending on its full history of (local) actions and observations, are huge, which makes it difficult to store them onboard, at execution time, and also hampers the computation of joint plans. We propose and investigate a new representation for joint policies in QDec-POMDPs, which we call Multi-Agent Knowledge-Based Programs (MAKBPs), and which uses epistemic logic for compactly representing conditions on histories. Contrary to standard representations, executing an MAKBP requires reasoning at execution time, but we show that MAKBPs can be exponentially more succinct than any reactive representation.

关键词

全文

PDF


Risk-Aware Proactive Scheduling via Conditional Value-at-Risk

摘要

In this paper, we consider the challenging problem of riskaware proactive scheduling with the objective of minimizing robust makespan. State-of-the-art approaches based on probabilistic constrained optimization lead to Mixed Integer Linear Programs that must be heuristically approximated. We optimize the robust makespan via a coherent risk measure, Conditional Value-at-Risk (CVaR). Since traditional CVaR optimization approaches assuming linear spaces does not suit our problem, we propose a general branch-and-bound framework for combinatorial CVaR minimization. We then design an approximate complete algorithm, and employ resource reasoning to enable constraint propagation for multiple samples. Empirical results show that our algorithm outperforms state-of-the-art approaches with higher solution quality.

关键词

全文

PDF


Stackelberg Planning: Towards Effective Leader-Follower State Space Search

摘要

Inspired by work on Stackelberg security games, we introduce Stackelberg planning, where a leader player in a classical planning task chooses a minimum-cost action sequence aimed at maximizing the plan cost of a follower player in the same task. Such Stackelberg planning can provide useful analyses not only in planning-based security applications like network penetration testing, but also to measure robustness against perturbances in more traditional planning applications (e. g. with a leader sabotaging road network connections in transportation-type domains). To identify all equilibria---exhibiting the leader’s own-cost-vs.-follower-cost trade-off---we design leader-follower search, a state space search at the leader level which calls in each state an optimal planner at the follower level. We devise simple heuristic guidance, branch-and-bound style pruning, and partial-order reduction techniques for this setting. We run experiments on Stackelberg variants of IPC and pentesting benchmarks. In several domains, Stackelberg planning is quite feasible in practice.

关键词

Planning; Stackelberg; Security; Pentesting; Partial Order Reduction; IPC

全文

PDF


Action Schema Networks: Generalised Policies With Deep Learning

摘要

In this paper, we introduce the Action Schema Network (ASNet): a neural network architecture for learning generalised policies for probabilistic planning problems. By mimicking the relational structure of planning problems, ASNets are able to adopt a weight sharing scheme which allows the network to be applied to any problem from a given planning domain. This allows the cost of training the network to be amortised over all problems in that domain. Further, we propose a training method which balances exploration and supervised training on small problems to produce a policy which remains robust when evaluated on larger problems. In experiments, we show that ASNet's learning capability allows it to significantly outperform traditional non-learning planners in several challenging domains.

关键词

全文

PDF


Learning Conditional Generative Models for Temporal Point Processes

摘要

Estimating the future event sequence conditioned on current observations is a long-standing and challenging task in temporal analysis. On one hand for many real-world problems the underlying dynamics can be very complex and often unknown. This renders the traditional parametric point process models often fail to fit the data for their limited capacity. On the other hand, long-term prediction suffers from the problem of bias exposure where the error accumulates and propagates to future prediction. Our new model builds upon the sequence to sequence (seq2seq) prediction network. Compared with parametric point process models, its modeling capacity is higher and has better flexibility for fitting real-world data. The main novelty of the paper is to mitigate the second challenge by introducing the likelihood-free loss based on Wasserstein distance between point processes, besides negative maximum likelihood loss used in the traditional seq2seq model. Wasserstein distance, unlike KL divergence i.e. MLE loss, is sensitive to the underlying geometry between samples and can robustly enforce close geometry structure between them. This technique is proven able to improve the vanilla seq2seq model by a notable margin on various tasks.

关键词

Temporal Point Processes

全文

PDF


Combining Experts’ Causal Judgments

摘要

Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts’ opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts’ causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being compatible, and show how compatible causal models can be combined. We then use it as the basis for combining experts causal judgments. We illustrate our approach on a number of real-life examples.

关键词

causality, intervention, combining causal judgments

全文

PDF


An Experimental Study of Advice in Sequential Decision-Making Under Uncertainty

摘要

We consider sequential decision making problems under uncertainty, in which a user has a general idea of the task to achieve, and gives advice to an agent in charge of computing an optimal policy. Many different notions of advice have been proposed in somewhat different settings, especially in the field of inverse reinforcement learning and for resolution of Markov Decision Problems with Imprecise Rewards. Two key questions are whether the advice required by a specific method is natural for the user to give, and how much advice is needed for the agent to compute a good policy, as evaluated by the user. We give a unified view of a number of proposals made in the literature, and propose a new notion of advice, which corresponds to a user telling why she would take a given action in a given state. For all these notions, we discuss their naturalness for a user and the integration of advice. We then report on an experimental study of the amount of advice needed for the agent to compute a good policy. Our study shows in particular that continual interaction between the user and the agent is worthwhile, and sheds light on the pros and cons of each type of advice.

关键词

全文

PDF


Optimal Approximation of Random Variables for Estimating the Probability of Meeting a Plan Deadline

摘要

In planning algorithms and in other domains, there is often a need to run long computations that involve summations, maximizations and other operations on random variables, and to store intermediate results. In this paper, as a main motivating example, we elaborate on the case of estimating probabilities of meeting deadlines in hierarchical plans. A source of computational complexity, often neglected in the analysis of such algorithms, is that the support of the variables needed as intermediate results may grow exponentially along the computation. Therefore, to avoid exponential memory and time complexities, we need to trim these variables. This is similar, in a sense, to rounding intermediate results in numerical computations. Of course, to maintain the quality of algorithms, the trimming procedure should be efficient and it must maintain accuracy as much as possible. In this paper, we propose an optimal trimming algorithm with polynomial time and memory complexities for the purpose of estimating probabilities of deadlines in plans. More specifically, we show that our algorithm, given the needed size of the representation of the variable, provides the best possible approximation, where approximation accuracy is considered with a measure that fits the goal of estimating deadline meeting probabilities.

关键词

deadline; approximation of random variables; Kolmogorov; decision making under uncertainty

全文

PDF


Generalized Adjustment Under Confounding and Selection Biases

摘要

Selection and confounding biases are the two most common impediments to the applicability of causal inference methods in large-scale settings. We generalize the notion of backdoor adjustment to account for both biases and leverage external data that may be available without selection bias (e.g., data from census). We introduce the notion of adjustment pair and present complete graphical conditions for identifying causal effects by adjustment. We further design an algorithm for listing all admissible adjustment pairs in polynomial delay, which is useful for researchers interested in evaluating certain properties of some admissible pairs but not all (common properties include cost, variance, and feasibility to measure). Finally, we describe a statistical estimation procedure that can be performed once a set is known to be admissible, which entails different challenges in terms of finite samples.

关键词

causality; adjustment; backdoor; identifiability

全文

PDF


Armstrong's Axioms and Navigation Strategies

摘要

The paper investigates navigability with imperfect information. It shows that the properties of navigability with perfect recall are exactly those captured by Armstrong's axioms from database theory. If the assumption of perfect recall is omitted, then Armstrong's transitivity axiom is not valid, but it can be replaced by a weaker principle. The main technical results are soundness and completeness theorems for the logical systems describing properties of navigability with and without perfect recall.

关键词

navigation; completeness; axiomatization; memoryless; perfect recall

全文

PDF


Lifted Generalized Dual Decomposition

摘要

Many real-world problems, such as Markov Logic Networks (MLNs) with evidence, can be represented as a highly symmetric graphical model perturbed by additional potentials. In these models, variational inference approaches that exploit exact model symmetries are often forced to ground the entire problem, while methods that exploit approximate symmetries (such as by constructing an over-symmetric approximate model) offer no guarantees on solution quality. In this paper, we present a method based on a lifted variant of the generalized dual decomposition (GenDD) for marginal MAP inference which provides a principled way to exploit symmetric sub-structures in a graphical model. We develop a coarse-to-fine inference procedure that provides any-time upper bounds on the objective. The upper bound property of GenDD provides a principled way to guide the refinement process, providing good any-time performance and eventually arriving at the ground optimal solution.

关键词

全文

PDF


Learning Mixtures of MLNs

摘要

Weight learning is a challenging problem in Markov Logic Networks (MLNs) due to the large size of the ground propositional probabilistic graphical model that underlies the first-order representation of MLNs. Though more sophisticated weight learning methods that use lifted inference have been proposed, such methods can typically scale up only in the absence of evidence, namely in generative weight learning. In discriminative learning, where the evidence typically destroys symmetries, existing approaches are lacking in scalability. In this paper, we propose a novel, intuitive approach for learning MLNs discriminatively by utilizing approximate symmetries. Specifically, we reduce the size of the training database by clustering approximately symmetric atoms together and selecting a representative atom from each cluster. However, each choice made from the clusters induces a different distribution, increasing the uncertainty in our learned model. To reduce this uncertainty, we learn a finite mixture model by stacking the different distributions, where the parameters of the model are learned using an EM approach. Our results on several benchmarks show that our approach is much more scalable and accurate as compared to existing state-of-the-art MLN learning methods.

关键词

Markov Logic Networks; Probabilistic Graphical Models; Approximate Learning

全文

PDF


RelNN: A Deep Neural Model for Relational Learning

摘要

Statistical relational AI (StarAI) aims at reasoning and learning in noisy domains described in terms of objects and relationships by combining probability with first-order logic. With huge advances in deep learning in the current years, combining deep networks with first-order logic has been the focus of several recent studies. Many of the existing attempts, however, only focus on relations and ignore object properties. The attempts that do consider object properties are limited in terms of modelling power or scalability. In this paper, we develop relational neural networks (RelNNs) by adding hidden layers to relational logistic regression (the relational counterpart of logistic regression). We learn latent properties for objects both directly and through general rules. Back-propagation is used for training these models. A modular, layer-wise architecture facilitates utilizing the techniques developed within deep learning community to our architecture. Initial experiments on eight tasks over three real-world datasets show that RelNNs are promising models for relational learning.

关键词

Artificial Intelligence; Machine Learning; Relational Learning; Deep Learning; Relational Logistic Regression; Markov Logic Networks; Symbolic Neural Models; Deep Relational Learning; Statistical Relational Learning; Statistical Relational AI

全文

PDF


Approximate Inference via Weighted Rademacher Complexity

摘要

Rademacher complexity is often used to characterize the learnability of a hypothesis class and is known to be related to the class size. We leverage this observation and introduce a new technique for estimating the size of an arbitrary weighted set, defined as the sum of weights of all elements in the set. Our technique provides upper and lower bounds on a novel generalization of Rademacher complexity to the weighted setting in terms of the weighted set size. This generalizes Massart’s Lemma, a known upper bound on the Rademacher complexity in terms of the unweighted set size. We show that the weighted Rademacher complexity can be estimated by solving a randomly perturbed optimization problem, allowing us to derive high probability bounds on the size of any weighted set. We apply our method to the problems of calculating the partition function of an Ising model and computing propositional model counts (#SAT). Our experiments demonstrate that we can produce tighter bounds than competing methods in both the weighted and unweighted settings.

关键词

inference; #SAT; Rademacher complexity

全文

PDF


Relational Marginal Problems: Theory and Estimation

摘要

In the propositional setting, the marginal problem is to find a (maximum-entropy) distribution that has some given marginals. We study this problem in a relational setting and make the following contributions. First, we compare two different notions of relational marginals. Second, we show a duality between the resulting relational marginal problems and the maximum likelihood estimation of the parameters of relational models, which generalizes a well-known duality from the propositional setting. Third, by exploiting the relational marginal formulation, we present a statistically sound method to learn the parameters of relational models that will be applied in settings where the number of constants differs between the training and test data. Furthermore, based on a relational generalization of marginal polytopes, we characterize cases where the standard estimators based on feature's number of true groundings needs to be adjusted and we quantitatively characterize the consequences of these adjustments. Fourth, we prove bounds on expected errors of the estimated parameters, which allows us to lower-bound, among other things, the effective sample size of relational training data.

关键词

artificial intelligence; relational learning;

全文

PDF


Anytime Anyspace AND/OR Best-First Search for Bounding Marginal MAP

摘要

Marginal MAP is a key task in Bayesian inference and decision-making. It is known to be very difficult in general, particularly because the evaluation of each MAP assignment requires solving an internal summation problem. In this paper, we propose a best-first search algorithm that provides anytime upper bounds for marginal MAP in graphical models. It folds the computation of external maximization and internal summation into an AND/OR tree search framework, and solves them simultaneously using a unified best-first search algorithm. The algorithm avoids some unnecessary computation of summation sub-problems associated with MAP assignments, and thus yields significant time savings. Furthermore, our algorithm is able to operate within limited memory. Empirical evaluation on three challenging benchmarks demonstrates that our unified best-first search algorithm using pre-compiled variational heuristics often provides tighter anytime upper bounds compared to those state-of-the-art baselines.

关键词

graphical models; marginal MAP; inference; search; bounds

全文

PDF


A Neural Stochastic Volatility Model

摘要

In this paper, we show that the recent integration of statistical models with deep recurrent neural networks provides a new way of formulating volatility (the degree of variation of time series) models that have been widely used in time series analysis and prediction in finance. The model comprises a pair of complementary stochastic recurrent neural networks: the generative network models the joint distribution of the stochastic volatility process; the inference network approximates the conditional distribution of the latent variables given the observables. Our focus here is on the formulation of temporal dynamics of volatility over time under a stochastic recurrent neural network framework. Experiments on real-world stock price datasets demonstrate that the proposed model generates a better volatility estimation and prediction that outperforms mainstream methods, e.g., deterministic models such as GARCH and its variants, and stochastic models namely the MCMC-based stochvol as well as the Gaussian-process-based, on average negative log-likelihood.

关键词

volatility modelling; variational inference; neural networks

全文

PDF


Learning Robust Options

摘要

Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We utilize ROPI to learn robust options with the Robust Options Deep Q Network (RO-DQN) that solves multiple tasks and mitigates model misspecification due to model uncertainty. We present experimental results which suggest that policy iteration with linear features may have an inherent form of robustness when using coarse feature representations. In addition, we present experimental results which demonstrate that robustness helps policy iteration implemented on top of deep neural networks to generalize over a much broader range of dynamics than non-robust policy iteration.

关键词

Reinforcement Learning, Robustness

全文

PDF


Efficient-UCBV: An Almost Optimal Algorithm Using Variance Estimates

摘要

We propose a novel variant of the UCB algorithm (referred to as Efficient-UCB-Variance (EUCBV)) for minimizing cumulative regret in the stochastic multi-armed bandit (MAB) setting. EUCBV incorporates the arm elimination strategy proposed in UCB-Improved, while taking into account the variance estimates to compute the arms' confidence bounds, similar to UCBV. Through a theoretical analysis we establish that EUCBV incurs a gap-dependent regret bound which is an improvement over that of existing state-of-the-art UCB algorithms (such as UCB1, UCB-Improved, UCBV, MOSS). Further, EUCBV incurs a gap-independent regret bound which is an improvement over that of UCB1, UCBV and UCB-Improved, while being comparable with that of MOSS and OCUCB. Through an extensive numerical study we show that EUCBV significantly outperforms the popular UCB variants (like MOSS, OCUCB, etc.) as well as Thompson sampling and Bayes-UCB algorithms.

关键词

Bandits; UCB-Improved; UCBV; Regret

全文

PDF


Hawkes Process Inference With Missing Data

摘要

A multivariate Hawkes process is a class of marked point processes: A sample consists of a finite set of events of unbounded random size; each event has a real-valued time and a discrete-valued label (mark). It is self-excitatory: Each event causes an increase in the rate of other events (of either the same or a different label) in the (near) future. Prior work has developed methods for parameter estimation from complete samples. However, just as unobserved variables can increase the modeling power of other probabilistic models, allowing unobserved events can increase the modeling power of point processes. In this paper we develop a method to sample over the posterior distribution of unobserved events in a multivariate Hawkes process. We demonstrate the efficacy of our approach, and its utility in improving predictive power and identifying latent structure in real-world data.

关键词

point process; Hawkes process; MCMC inference

全文

PDF


Conditional PSDDs: Modeling and Learning With Modular Knowledge

摘要

Probabilistic Sentential Decision Diagrams (PSDDs) have been proposed for learning tractable probability distributions from a combination of data and background knowledge (in the form of Boolean constraints). In this paper, we propose a variant on PSDDs, called conditional PSDDs, for representing a family of distributions that are conditioned on the same set of variables. Conditional PSDDs can also be learned from a combination of data and (modular) background knowledge. We use conditional PSDDs to define a more structured version of Bayesian networks, in which nodes can have an exponential number of states, hence expanding the scope of domains where Bayesian networks can be applied. Compared to classical PSDDs, the new representation exploits the independencies captured by a Bayesian network to decompose the learning process into localized learning tasks, which enables the learning of better models while using less computation. We illustrate the promise of conditional PSDDs and structured Bayesian networks empirically, and by providing a case study to the modeling of distributions over routes on a map.

关键词

Bayesian Networks, Decision Diagrams, Learning, Constraints

全文

PDF


Information Acquisition Under Resource Limitations in a Noisy Environment

摘要

We introduce a theoretical model of information acquisition under resource limitations in a noisy environment. An agent must guess the truth value of a given Boolean formula φ after performing a bounded number of noisy tests of the truth values of variables in the formula. We observe that, in general, the problem of finding an optimal testing strategy for φ is hard, but we suggest a useful heuristic. The techniques we use also give insight into two apparently unrelated, but well-studied problems: (1) rational inattention (the optimal strategy may involve hardly ever testing variables that are clearly relevant to φ) and (2) what makes a formula hard to learn/remember.

关键词

rationality, inattention, noisy observations

全文

PDF


Risk-Sensitive Submodular Optimization

摘要

The conditional value at risk (CVaR) is a popular risk measure which enables risk-averse decision making under uncertainty. We consider maximizing the CVaR of a continuous submodular function, an extension of submodular set functions to a continuous domain. One example application is allocating a continuous amount of energy to each sensor in a network, with the goal of detecting intrusion or contamination. Previous work allows maximization of the CVaR of a linear or concave function. Continuous submodularity represents a natural set of nonconcave functions with diminishing returns, to which existing techniques do not apply. We give a (1 - 1/e)-approximation algorithm for maximizing the CVaR of a monotone continuous submodular function. This also yields an algorithm for submodular set functions which produces a distribution over feasible sets with guaranteed CVaR. Experimental results in two sensor placement domains confirm that our algorithm substantially outperforms competitive baselines.

关键词

Submodularity; risk aversion; CVaR

全文

PDF


Towards Training Probabilistic Topic Models on Neuromorphic Multi-Chip Systems

摘要

Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). By now, their training is implemented on general purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called the neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models.The first SNN is a batch algorithm combining the conventional collapsed Gibbs sampling (CGS) algorithm and an inference SNN to train LDA. The other two SNNs are online algorithms targeting at both energy- and storage-limited environments. The two online algorithms are equivalent with training LDA by using maximum-a-posterior estimation and maximizing the semi-collapsed likelihood, respectively.They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.

关键词

probabilistic topic model; neuromorphic chip; spiking neural network

全文

PDF


IONet: Learning to Cure the Curse of Drift in Inertial Odometry

摘要

Inertial sensors play a pivotal role in indoor localization, which in turn lays the foundation for pervasive personal applications. However, low-cost inertial sensors, as commonly found in smartphones, are plagued by bias and noise, which leads to unbounded growth in error when accelerations are double integrated to obtain displacement. Small errors in state estimation propagate to make odometry virtually unusable in a matter of seconds. We propose to break the cycle of continuous integration, and instead segment inertial data into independent windows. The challenge becomes estimating the latent states of each window, such as velocity and orientation, as these are not directly observable from sensor data. We demonstrate how to formulate this as an optimization problem, and show how deep recurrent neural networks can yield highly accurate trajectories, outperforming state-of-the-art shallow techniques, on a wide range of tests and attachments. In particular, we demonstrate that IONet can generalize to estimate odometry for non-periodic motion, such as a shopping trolley or baby-stroller, an extremely challenging task for existing techniques.

关键词

Indoor Localization; Inertial navigation; Deep Neural Networks

全文

PDF


Improved Results for Minimum Constraint Removal

摘要

Given a set of obstacles and two designated points in the plane, the Minimum Constraint Removal problem asks for a minimum number of obstacles that can be removed so that a collision-free path exists between the two designated points. It is a well-studied problem in both robotic motion planning and wireless computing that has been shown to be NP-hard in various settings. In this work, we extend the study of Minimum Constraint Removal. We start by presenting refined NP-hardness reductions for the two cases: (1) when all the obstacles are axes-parallel rectangles, and (2) when all the obstacles are line segments such that no three intersect at the same point. These results improve on existing results in the literature. As a byproduct of our NP-hardness reductions, we prove that, unless the Exponential-Time Hypothesis (ETH) fails, Minimum Constraint Removal cannot be solved in subexponential time 2o(n), where n is the number of obstacles in the instance. This shows that significant improvement on the brute-force 2O(n)-time algorithm is unlikely. We then present a subexponential-time algorithm for instances of Minimum Constraint Removal in which the number of obstacles that overlap at any point is constant; the algorithm runs in time 2O(√N), where N is the number of the vertices in the auxiliary graph associated with the instance of the problem. We show that significant improvement on this algorithm is unlikely by showing that, unless ETH fails, Minimum Constraint Removal with bounded overlap number cannot be solved in time 2o(√N). We describe several exact algorithms and approximation algorithms that leverage heuristics and discuss their performance in an extensive empirical simulation.

关键词

minimum constraint removal; barrier coverage; NP-hardness; exact algorithms

全文

PDF


Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

摘要

Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in Cyber-Physical Systems because high-fidelity physical models of systems are expensive to develop and often intractable to verify. Conversely, reinforcement learning-based controllers are lauded for their flexibility in unmodeled environments, but do not provide guarantees of safe operation. This paper presents an approach for provably safe learning that provides the best of both worlds: the exploration and optimization capabilities of learning along with the safety guarantees of formal verification. Our main insight is that formal verification combined with verified runtime monitoring can ensure the safety of a learning agent. Verification results are preserved whenever learning agents limit exploration within the confounds of verified control choices as long as observed reality comports with the model used for off-line verification. When a model violation is detected, the agent abandons efficiency and instead attempts to learn a control strategy that guides the agent to a modeled portion of the state space. We prove that our approach toward incorporating knowledge about safe control into learning systems preserves safety guarantees, and demonstrate that we retain the empirical performance benefits provided by reinforcement learning. We also explore various points in the design space for these justified speculative controllers in a simple model of adaptive cruise control model for autonomous cars.

关键词

Formal Methods; Software Verification; Safe Reinforcement Learning; AI Safety; Hybrid Systems; Cyber-Physical Systems

全文

PDF


Iterative Continuous Convolution for 3D Template Matching and Global Localization

摘要

This paper introduces a novel methodology for 3D template matching that is scalable to higher-dimensional spaces and larger kernel sizes. It uses the Hilbert Maps framework to model raw pointcloud information as a continuous occupancy function, and we derive a closed-form solution to the convolution operation that takes place directly in the Reproducing Kernel Hilbert Space defining these functions. The result is a third function modeling activation values, that can be queried at arbitrary resolutions with logarithmic complexity, and by iteratively searching for high similarity areas we can determine matching candidates. Experimental results show substantial speed gains over standard discrete convolution techniques, such as sliding window and fast Fourier transform, along with a significant decrease in memory requirements, without accuracy loss. This efficiency allows the proposed methodology to be used in areas where discrete convolution is currently infeasible. As a practical example we explore the key problem in robotics of global localization, in which a vehicle must be positioned on a map using only its current sensor information, and provide comparisons with other state-of-the-art techniques in terms of computational speed and accuracy.

关键词

Mapping; Localization; Convolution; Correlation; Global Optimization; Template Matching; Hilbert Maps

全文

PDF


Learning Integrated Holism-Landmark Representations for Long-Term Loop Closure Detection

摘要

Loop closure detection is a critical component of large-scale simultaneous localization and mapping (SLAM) in loopy environments. This capability is challenging to achieve in long-term SLAM, when the environment appearance exhibits significant long-term variations across various time of the day, months, and even seasons. In this paper, we introduce a novel formulation to learn an integrated long-term representation based upon both holistic and landmark information, which integrates two previous insights under a unified framework: (1) holistic representations outperform keypoint-based representations, and (2) landmarks as an intermediate representation provide informative cues to detect challenging locations. Our new approach learns the representation by projecting input visual data into a low-dimensional space, which preserves both the global consistency (to minimize representation error) and the local consistency (to preserve landmarks’ pairwise relationship) of the input data. To solve the formulated optimization problem, a new algorithm is developed with theoretically guaranteed convergence. Extensive experiments have been conducted using two large-scale public benchmark data sets, in which the promising performances have demonstrated the effectiveness of the proposed approach.

关键词

Long-Term; Localization; Mapping; Navigation

全文

PDF


Guiding Search in Continuous State-Action Spaces by Learning an Action Sampler From Off-Target Search Experience

摘要

In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces for long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper, we present an approach that guides search in continuous spaces for generic planners by learning an action sampler from past search experience. We use a Generative Adversarial Network (GAN) to represent an action sampler, and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that actually are on a solution path. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and empirical evaluation in three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm.

关键词

全文

PDF


Unsupervised Selection of Negative Examples for Grounded Language Learning

摘要

There has been substantial work in recent years on grounded language acquisition, in which language and sensor data are used to create a model relating linguistic constructs to the perceivable world. While powerful, this approach is frequently hindered by ambiguities, redundancies, and omissions found in natural language. We describe an unsupervised system that learns language by training visual classifiers, first selecting important terms from object descriptions, then automatically choosing negative examples from a paired corpus of perceptual and linguistic data. We evaluate the effectiveness of each stage as well as the system's performance on the overall learning task.

关键词

robotics,human-robot-interaction,natural-language-processing

全文

PDF


From Virtual Demonstration to Real-World Manipulation Using LSTM and MDN

摘要

Robots assisting the disabled or elderly must perform complex manipulation tasks and must adapt to the home environment and preferences of their user. Learning from demonstration is a promising choice, that would allow the non-technical user to teach the robot different tasks. However, collecting demonstrations in the home environment of a disabled user is time consuming, disruptive to the comfort of the user, and presents safety challenges. It would be desirable to perform the demonstrations in a virtual environment. In this paper we describe a solution to the challenging problem of behavior transfer from virtual demonstration to a physical robot. The virtual demonstrations are used to train a deep neural network based controller, which is using a Long Short Term Memory (LSTM) recurrent neural network to generate trajectories. The training process uses a Mixture Density Network (MDN) to calculate an error signal suitable for the multimodal nature of demonstrations. The controller learned in the virtual environment is transferred to a physical robot (a Rethink Robotics Baxter). An off-the-shelf vision component is used to substitute for geometric knowledge available in the simulation and an inverse kinematics module is used to allow the Baxter to enact the trajectory. Our experimental studies validate the three contributions of the paper: (1) the controller learned from virtual demonstrations can be used to successfully perform the manipulation tasks on a physical robot, (2) the LSTM+MDN architectural choice outperforms other choices, such as the use of feedforward networks and mean-squared error based training signals and (3) allowing imperfect demonstrations in the training set also allows the controller to learn how to correct its manipulation mistakes.

关键词

Robot learning; Learning from demonstration; Deep neural networks

全文

PDF


Building Continuous Occupancy Maps With Moving Robots

摘要

Mapping the occupancy level of an environment is important for a robot to navigate in unknown and unstructured environments. To this end, continuous occupancy mapping techniques which express the probability of a location as a function are used. In this work, we provide a theoretical analysis to compare and contrast the two major branches of Bayesian continuous occupancy mapping techniques---Gaussian process occupancy maps and Bayesian Hilbert maps---considering the fact that both utilize kernel functions to operate in a rich high-dimensional implicit feature space and use variational inference to learn parameters. Then, we extend the recent Bayesian Hilbert maps framework which is so far only used for stationary robots, to map large environments with moving robots. Finally, we propose convolution of kernels as a powerful tool to improve different aspects of continuous occupancy mapping. Our claims are also experimentally validated with both simulated and real-world datasets.

关键词

Occupancy map; SLAM; robot learning; kernels

全文

PDF


Phase-Parametric Policies for Reinforcement Learning in Cyclic Environments

摘要

In many reinforcement learning problems, parameters of the model may vary with its phase while the agent attempts to learn through its interaction with the environment. For example, an autonomous car's reward on selecting a path may depend on traffic conditions at the time of the day or the transition dynamics of a drone may depend on the current wind direction. Many such processes exhibit a cyclic phase-structure and could be represented with a control policy parameterized over a circular or cyclic phase space. Attempting to model such phase variations with a standard data-driven approach (e.g. deep networks) without explicitly modeling the phase of the model can be challenging. Ambiguities may arise as the optimal action for a given state can vary depending on the phase. To better model cyclic environments, we propose phase-parameterized policies and value function approximators that explicitly enforce a cyclic structure to the policy or value space. We apply our phase-parameterized reinforcement learning approach to both feed-forward and recurrent deep networks in the context of trajectory optimization and locomotion problems. Our experiments show that our proposed approach has superior modeling performance than traditional function approximators in cyclic environments.

关键词

Reinforcement Learning; Deep Learning

全文

PDF


Safe Exploration and Optimization of Constrained MDPs Using Gaussian Processes

摘要

We present a reinforcement learning approach to explore and optimize a safety-constrained Markov Decision Process(MDP). In this setting, the agent must maximize discounted cumulative reward while constraining the probability of entering unsafe states, defined using a safety function being within some tolerance. The safety values of all states are not known a priori, and we probabilistically model them via aGaussian Process (GP) prior. As such, properly behaving in such an environment requires balancing a three-way trade-off of exploring the safety function, exploring the reward function, and exploiting acquired knowledge to maximize reward. We propose a novel approach to balance this trade-off. Specifically, our approach explores unvisited states selectively; that is, it prioritizes the exploration of a state if visiting that state significantly improves the knowledge on the achievable cumulative reward. Our approach relies on a novel information gain criterion based on Gaussian Process representations of the reward and safety functions. We demonstrate the effectiveness of our approach on a range of experiments, including a simulation using the real Martian terrain data.

关键词

Markov Decision Process, Gaussian Processes

全文

PDF


Sweep-Based Propagation for String Constraint Solving

摘要

Solving constraints over strings is an emerging important field. Recently, a Constraint Programming approach based on dashed strings has been proposed to enable a compact domain representation for potentially large bounded-length string variables. In this paper, we present a more efficient algorithm for propagating equality (and related constraints) over dashed strings. We call this propagation sweep-based. Experimental evidences show that sweep-based propagation is able to significantly outperform state-of-the-art approaches for string constraint solving.

关键词

Artificial Intelligence; Constraint Programming; String Solving

全文

PDF


MaxSAT Resolution With the Dual Rail Encoding

摘要

Conflict-driven clause learning (CDCL) is at the core of the success of modern SAT solvers. In terms of propositional proof complexity, CDCL has been shown as strong as general resolution. Improvements to SAT solvers can be realized either by improving existing algorithms, or by exploiting proof systems stronger than CDCL. Recent work proposed an approach for solving SAT by reduction to Horn MaxSAT. The proposed reduction coupled with MaxSAT resolution represents a new proof system, DRMaxSAT, which was shown to enable polynomial time refutations of pigeonhole formulas, in contrast with either CDCL or general resolution. This paper investigates the DRMaxSAT proof system, and shows that DRMaxSAT p-simulates general resolution, that AC0-Frege+PHP p-simulates DRMaxSAT, and that DRMaxSAT can not p-simulate AC0-Frege+PHP or the cutting planes proof system.

关键词

Proof complexity; Proof system; Satisfiability; MaxSAT resolution; Dual rail encoding

全文

PDF


A SAT+CAS Method for Enumerating Williamson Matrices of Even Order

摘要

We present for the first time an exhaustive enumeration of Williamson matrices of even order n < 65. The search method relies on the novel SAT+CAS paradigm of coupling SAT solvers with computer algebra systems so as to take advantage of the advances made in both the field of satisfiability checking and the field of symbolic computation. Additionally, we use a programmatic SAT solver which allows conflict clauses to be learned programmatically, through a piece of code specifically tailored to the domain area. Prior to our work, Williamson matrices had only been enumerated for odd orders n < 60, so our work increases the bounds that Williamson matrices have been enumerated up to and provides the first enumeration of Williamson matrices of even order. Our results show that Williamson matrices of even order tend to be much more abundant than those of odd orders. In particular, Williamson matrices exist for every even order n < 65 but do not exist in orders 35, 47, 53, and 59.

关键词

satisfiability checking; symbolic computation; Williamson matrices

全文

PDF


Exact MAP-Inference by Confining Combinatorial Search With LP Relaxation

摘要

We consider the MAP-inference problem for graphical models, which is a valued constraint satisfaction problem defined on real numbers with a natural summation operation. We propose a family of relaxations (different from the famous Sherali-Adams hierarchy), which naturally define lower bounds for its optimum. This family always contains a tight relaxation and we give an algorithm able to find it and therefore, solve the initial non-relaxed NP-hard problem. The relaxations we consider decompose the original problem into two non-overlapping parts: an easy LP-tight part and a difficult one. For the latter part a combinatorial solver must be used. As we show in our experiments, in a number of applications the second, difficult part constitutes only a small fraction of the whole problem. This property allows to significantly reduce the computational time of the combinatorial solver and therefore solve problems which were out of reach before.

关键词

MAP-inference; energy minimization; graphical models; relaxation

全文

PDF


Community-Based Trip Sharing for Urban Commuting

摘要

This paper explores Community-Based Trip Sharing which uses the structure of communities and commuting patterns to optimize car or ride sharing for urban communities. It introduces the Commuting Trip Sharing Problem (CTSP) and proposes an optimization approach to maximize trip sharing. The optimization method, which exploits trip clustering, shareability graphs, and mixed-integer programming, is applied to a dataset of 9000 daily commuting trips from a mid-size city. Experimental results show that community-based trip sharing reduces daily car usage by up to 44%, thus producing significant environmental and traffic benefits and reducing parking pressure. The results also indicate that daily flexibility in pairing cars and passengers has significant impact on the benefits of the approach, revealing new insights on commuting patterns and trip sharing.

关键词

全文

PDF


Schur Number Five

摘要

We present the solution of a century-old problem known as Schur Number Five: What is the largest (natural) number n such that there exists a five-coloring of the positive numbers up to n without a monochromatic solution of the equation a + b = c? We obtained the solution, n = 160, by encoding the problem into propositional logic and applying massively parallel satisfiability solving techniques on the resulting formula. We also constructed and validated a proof of the solution to increase trust in the correctness of the multi-CPU-year computations. The proof is two petabytes in size and was certified using a formally verified proof checker, demonstrating that any result by satisfiability solvers---no matter how large---can now be validated using highly trustworthy systems.

关键词

Ramsey Theory; parallel SAT solving; proof checking

全文

PDF


Towards Generalization in QBF Solving via Machine Learning

摘要

There are well known cases of Quantified Boolean Formulas (QBFs) that have short winning strategies (Skolem/Herbrand functions) but that are hard to solve by nowadays solvers. This paper argues that a solver benefits from generalizing a set of individual wins into a strategy. This idea is realized on top of the competitive RAReQS algorithm by utilizing machine learning, which enables learning shorter strategies. The implemented prototype QFUN has won the first place in the non-CNF track of the most recent QBF competition.

关键词

QBF; SAT; quantification; machine learning

全文

PDF


Verifying Properties of Binarized Deep Neural Networks

摘要

Understanding properties of deep neural networks is an important challenge in deep learning. In this paper, we take a step in this direction by proposing a rigorous way of verifying properties of a popular class of neural networks, Binarized Neural Networks, using the well-developed means of Boolean satisfiability. Our main contribution is a construction that creates a representation of a binarized neural network as a Boolean formula. Our encoding is the first exact Boolean representation of a deep neural network. Using this encoding, we leverage the power of modern SAT solvers along with a proposed counterexample-guided search procedure to verify various properties of these networks. A particular focus will be on the critical property of robustness to adversarial perturbations. For this property, our experimental results demonstrate that our approach scales to medium-size deep neural networks used in image classification tasks. To the best of our knowledge, this is the first work on verifying properties of deep neural networks using an exact Boolean encoding of the network.

关键词

binarized deep neural networks; verification; SAT; counterexample-guided search

全文

PDF


Parallel Algorithms for Operations on Multi-Valued Decision Diagrams

摘要

Multi-valued Decision Diagrams (MDDs) have been extensively studied in the last ten years. Recently, efficient algorithms implementing operators such as reduction, union, intersection, difference, etc., have been designed. They directly deal with the graph structure of the MDD and a time reduction of several orders of magnitude in comparison to other existing algorithms have been observed. These operators have permitted a new look at MDDs, because extremely large MDDs can finally be manipulated as shown by the models used to solve complex application in music generation. However, MDDs become so large (50GB) that minutes are sometimes required to perform some operations. In order to accelerate the manipulation of MDDs, parallel algorithms are required. In this paper, we introduce such algorithms. We carefully design them in order to overcome inherent difficulties of the parallelization of sequential algorithms such as data dependencies, software lock-out, false sharing, or load balancing. As a result, we observe a speed-up , i.e. ratio between parallel and sequential runtimes, growing linearly with the number of cores.

关键词

constraint programming; CSP; MDD; parallel algorithm

全文

PDF


Premise Set Caching for Enumerating Minimal Correction Subsets

摘要

Methods for explaining the sources of inconsistency of overconstrained systems find an ever-increasing number of applications, ranging from diagnosis and configuration to ontology debugging and axiom pinpointing in description logics. Efficient enumeration of minimal correction subsets (MCSes), defined as sets of constraints whose removal from the system restores feasibility, is a central task in such domains. In this work, we propose a novel approach to speeding up MCS enumeration over conjunctive normal form propositional formulas by caching of so-called premise sets (PSes) seen during the enumeration process. Contrasting to earlier work, we move from caching unsatisfiable cores to caching PSes and propose a more effective way of implementing the cache. The proposed techniques noticeably improves on the performance of state-of-the-art MCS enumeration algorithms in practice.

关键词

inconsistency analysis, minimal corrections subsets, enumeration, caching

全文

PDF


On Cryptographic Attacks Using Backdoors for SAT

摘要

Propositional satisfiability (SAT) is at the nucleus of state-of-the-art approaches to a variety of computationally hard problems, one of which is cryptanalysis. Moreover, a number of practical applications of SAT can only be tackled efficiently by identifying and exploiting a subset of formula's variables called backdoor set (or simply backdoors). This paper proposes a new class of backdoor sets for SAT used in the context of cryptographic attacks, namely guess-and-determine attacks. The idea is to identify the best set of backdoor variables subject to a statistically estimated hardness of the guess-and-determine attack using a SAT solver. Experimental results on weakened variants of the renowned encryption algorithms exhibit advantage of the proposed approach compared to the state of the art in terms of the estimated hardness of the resulting guess-and-determine attacks.

关键词

Satisfiability; Backdoors; Cryptanalysis; Guess-and-determine attack

全文

PDF


Enhancing Constraint-Based Multi-Objective Combinatorial Optimization

摘要

Minimal Correction Subsets (MCSs) have been successfully applied to find approximate solutions to several real-world single-objective optimization problems. However, only recently have MCSs been used to solve Multi-Objective Combinatorial Optimization (MOCO) problems. In particular, it has been shown that all optimal solutions of MOCO problems with linear objective functions can be found by an MCS enumeration procedure. In this paper, we show that the approach of MCS enumeration can also be applied to MOCO problems where objective functions are divisions of linear expressions. Hence, it is not necessary to use a linear approximation of these objective functions. Additionally, we also propose the integration of diversification techniques on the MCS enumeration process in order to find better approximations of the Pareto front of MOCO problems. Finally, experimental results on the Virtual Machine Consolidation (VMC) problem show the effectiveness of the proposed techniques.

关键词

Multi-Objective Combinatorial Optimization; Minimal Correction Subsets

全文

PDF


Learning Robust Search Strategies Using a Bandit-Based Approach

摘要

Effective solving of constraint problems often requires choosing good or specific search heuristics. However, choosing or designing a good search heuristic is non-trivial and is often a manual process. In this paper, rather than manually choosing/designing search heuristics, we propose the use of bandit-based learning techniques to automatically select search heuristics. Our approach is online where the solver learns and selects from a set of heuristics during search. The goal is to obtain automatic search heuristics which give robust performance. Preliminary experiments show that our adaptive technique is more robust than the original search heuristics. It can also outperform the original heuristics.

关键词

constraint satisfaction; CSP; search heuristic; multi-armed bandit, online learning

全文

PDF


Learning Spatio-Temporal Features With Partial Expression Sequences for On-the-Fly Prediction

摘要

Spatio-temporal feature encoding is essential for encoding facial expression dynamics in video sequences. At test time, most spatio-temporal encoding methods assume that a temporally segmented sequence is fed to a learned model, which could require the prediction to wait until the full sequence is available to an auxiliary task that performs the temporal segmentation. This causes a delay in predicting the expression. In an interactive setting, such as affective interactive agents, such delay in the prediction could not be tolerated. Therefore, training a model that can accurately predict the facial expression "on-the-fly" (as they are fed to the system) is essential. In this paper, we propose a new spatio-temporal feature learning method, which would allow prediction with partial sequences. As such, the prediction could be performed on-the-fly. The proposed method utilizes an estimated expression intensity to generate dense labels, which are used to regulate the prediction model training with a novel objective function. As results, the learned spatio-temporal features can robustly predict the expression with partial (incomplete) expression sequences, on-the-fly. Experimental results showed that the proposed method achieved higher recognition rates compared to the state-of-the-art methods on both datasets. More importantly, the results verified that the proposed method improved the prediction frames with partial expression sequence inputs.

关键词

On the Fly prediction, LSTM, Facial Expression Recognition, Deep Learning

全文

PDF


SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

摘要

Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task. In recent years several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been proposed. In this paper we present SEE, a step towards semi-supervised neural networks for scene text detection and recognition, that can be optimized end-to-end. Most existing works consist of multiple deep neural networks and several pre-processing steps. In contrast to this, we propose to use a single deep neural network, that learns to detect and recognize text from natural images, in a semi-supervised way. SEE is a network that integrates and jointly learns a spatial transformer network, which can learn to detect text regions in an image, and a text recognition network that takes the identified text regions and recognizes their textual content. We introduce the idea behind our novel approach and show its feasibility, by performing a range of experiments on standard benchmark datasets, where we achieve competitive results.

关键词

Computer Vision; Semi-Supervised Learning

全文

PDF


Asymmetric Joint Learning for Heterogeneous Face Recognition

摘要

Heterogeneous face recognition (HFR) refers to matching a probe face image taken from one modality to face images acquired from another modality. It plays an important role in security scenarios. However, HFR is still a challenging problem due to great discrepancies between cross-modality images. This paper proposes an asymmetric joint learning (AJL) approach to handle this issue. The proposed method transforms the cross-modality differences mutually by incorporating the synthesized images into the learning process which provides more discriminative information. Although the aggregated data would augment the scale of intra-classes, it also reduces the diversity (i.e. discriminative information) for inter-classes. Then, we develop the AJL model to balance this dilemma. Finally, we could obtain the similarity score between two heterogeneous face images through the log-likelihood ratio. Extensive experiments on viewed sketch database, forensic sketch database and near infrared image database illustrate that the proposed AJL-HFR method achieve superior performance in comparison to state-of-the-art methods.

关键词

全文

PDF


Lateral Inhibition-Inspired Convolutional Neural Network for Visual Attention and Saliency Detection

摘要

Lateral inhibition in top-down feedback is widely existing in visual neurobiology, but such an important mechanism has not be well explored yet in computer vision. In our recent research, we find that modeling lateral inhibition in convolutional neural network (LICNN) is very useful for visual attention and saliency detection. In this paper, we propose to formulate lateral inhibition inspired by the related studies from neurobiology, and embed it into the top-down gradient computation of a general CNN for classification, i.e. only category-level information is used. After this operation (only conducted once), the network has the ability to generate accurate category-specific attention maps. Further, we apply LICNN for weakly-supervised salient object detection.Extensive experimental studies on a set of databases, e.g., ECSSD, HKU-IS, PASCAL-S and DUT-OMRON, demonstrate the great advantage of LICNN which achieves the state-of-the-art performance. It is especially impressive that LICNN with only category-level supervised information even outperforms some recent methods with segmentation-level supervised learning.

关键词

lateral inhibition, convolutional neural network, top-down attention, saliency detection

全文

PDF


Transfer Adversarial Hashing for Hamming Space Retrieval

摘要

Hashing is widely applied to large-scale image retrieval due to the storage and retrieval efficiency. Existing work on deep hashing assumes that the database in the target domain is identically distributed with the training set in the source domain. This paper relaxes this assumption to a transfer retrieval setting, which allows the database and the training set to come from different but relevant domains. However, the transfer retrieval setting will introduce two technical difficulties: first, the hash model trained on the source domain cannot work well on the target domain due to the large distribution gap; second, the domain gap makes it difficult to concentrate the database points to be within a small Hamming ball. As a consequence, transfer retrieval performance within Hamming Radius 2 degrades significantly in existing hashing methods. This paper presents Transfer Adversarial Hashing (TAH), a new hybrid deep architecture that incorporates a pairwise t-distribution cross-entropy loss to learn concentrated hash codes and an adversarial network to align the data distributions between the source and target domains. TAH can generate compact transfer hash codes for efficient image retrieval on both source and target domains. Comprehensive experiments validate that TAH yields state of the art Hamming space retrieval performance on standard datasets.

关键词

Image retrieval; Hashing

全文

PDF


Temporal-Difference Learning With Sampling Baseline for Image Captioning

摘要

The existing methods for image captioning usually train the language model under the cross entropy loss, which results in the exposure bias and inconsistency of evaluation metric. Recent research has shown these two issues can be well addressed by policy gradient method in reinforcement learning domain attributable to its unique capability of directly optimizing the discrete and non-differentiable evaluation metric. In this paper, we utilize reinforcement learning method to train the image captioning model. Specifically, we train our image captioning model to maximize the overall reward of the sentences by adopting the temporal-difference (TD) learning method, which takes the correlation between temporally successive actions into account. In this way, we assign different values to different words in one sampled sentence by a discounted coefficient when back-propagating the gradient with the REINFORCE algorithm, enabling the correlation between actions to be learned. Besides, instead of estimating a "baseline" to normalize the rewards with another network, we utilize the reward of another Monte-Carlo sample as the "baseline" to avoid high variance. We show that our proposed method can improve the quality of generated captions and outperforms the state-of-the-art methods on the benchmark dataset MS COCO in terms of seven evaluation metrics.

关键词

Image captioning; Reinforcement learning; LSTM

全文

PDF


Order-Free RNN With Visual Attention for Multi-Label Classification

摘要

We propose a recurrent neural network (RNN) based model for image multi-label classification. Our model uniquely integrates and learning of visual attention and Long Short Term Memory (LSTM) layers, which jointly learns the labels of interest and their co-occurrences, while the associated image regions are visually attended. Different from existing approaches utilize either model in their network architectures, training of our model does not require pre-defined label orders. Moreover, a robust inference process is introduced so that prediction errors would not propagate and thus affect the performance. Our experiments on NUS-WISE and MS-COCO datasets confirm the design of our network and its effectiveness in solving multi-label classification problems.

关键词

multi-label

全文

PDF


Learning a Wavelet-Like Auto-Encoder to Accelerate Deep Neural Networks

摘要

Accelerating deep neural networks (DNNs) has been attracting increasing attention as it can benefit a wide range of applications, e.g., enabling mobile systems with limited computing resources to own powerful visual recognition ability. A practical strategy to this goal usually relies on a two-stage process: operating on the trained DNNs (e.g., approximating the convolutional filters with tensor decomposition) and fine-tuning the amended network, leading to difficulty in balancing the trade-off between acceleration and maintaining recognition performance. In this work, aiming at a general and comprehensive way for neural network acceleration, we develop a Wavelet-like Auto-Encoder (WAE) that decomposes the original input image into two low-resolution channels (sub-images) and incorporate the WAE into the classification neural networks for joint training. The two decomposed channels, in particular, are encoded to carry the low-frequency information (e.g., image profiles) and high-frequency (e.g., image details or noises), respectively, and enable reconstructing the original input image through the decoding process. Then, we feed the low-frequency channel into a standard classification network such as VGG or ResNet and employ a very lightweight network to fuse with the high-frequency channel to obtain the classification result. Compared to existing DNN acceleration solutions, our framework has the following advantages: i) it is tolerant to any existing convolutional neural networks for classification without amending their structures; ii) the WAE provides an interpretable way to preserve the main components of the input image for classification.

关键词

Network Acceleration, Image Classification, Image decompostion

全文

PDF


Recurrent Attentional Reinforcement Learning for Multi-Label Image Recognition

摘要

Recognizing multiple labels of images is a fundamental but challenging task in computer vision, and remarkable progress has been attained by localizing semantic-aware image regions and predicting their labels with deep convolutional neural networks. The step of hypothesis regions (region proposals) localization in these existing multi-label image recognition pipelines, however, usually takes redundant computation cost, e.g., generating hundreds of meaningless proposals with non-discriminative information and extracting their features, and the spatial contextual dependency modeling among the localized regions are often ignored or over-simplified. To resolve these issues, this paper proposes a recurrent attention reinforcement learning framework to iteratively discover a sequence of attentional and informative regions that are related to different semantic objects and further predict label scores conditioned on these regions. Besides, our method explicitly models long-term dependencies among these attentional regions that help to capture semantic label co-occurrence and thus facilitate multi-label recognition. Extensive experiments and comparisons on two large-scale benchmarks (i.e., PASCAL VOC and MS-COCO) show that our model achieves superior performance over existing state-of-the-art methods in both performance and efficiency as well as explicitly identifying image-level semantic labels to specific object regions.

关键词

Image recognition, Reinforcement learning, Recurrent attention model

全文

PDF


MixedPeds: Pedestrian Detection in Unannotated Videos Using Synthetically Generated Human-Agents for Training

摘要

We present a new method for training pedestrian detectors on an unannotated set of images. We produce a mixed reality dataset that is composed of real-world background images and synthetically generated static human-agents. Our approach is general, robust, and makes few assumptions about the unannotated dataset. We automatically extract from the dataset: i) the vanishing point to calibrate the virtual camera, and ii) the pedestrians' scales to generate a Spawn Probability Map, which is a novel concept that guides our algorithm to place the pedestrians at appropriate locations. After putting synthetic human-agents in the unannotated images, we use these augmented images to train a Pedestrian Detector, with the annotations generated along with the synthetic agents. We conducted our experiments using Faster R-CNN by comparing the detection results on the unannotated dataset performed by the detector trained using our approach and detectors trained with other manually labeled datasets. We showed that our approach improves the average precision by 5-13% over these detectors.

关键词

Pedestrian Detection; Synthetically Generated Data

全文

PDF


Self-View Grounding Given a Narrated 360° Video

摘要

Narrated 360° videos are typically provided in many touring scenarios to mimic real-world experience. However, previous work has shown that smart assistance (i.e., providing visual guidance) can significantly help users to follow the Normal Field of View (NFoV) corresponding to the narrative.In this project, we aim at automatically grounding the NFoVs of a 360° video given subtitles of the narrative (referred to as ''NFoV-grounding"). We propose a novel Visual Grounding Model (VGM) to implicitly and efficiently predict the NFoVs given the video content and subtitles. Specifically, at each frame, we efficiently encode the panorama into feature map of candidate NFoVs using a Convolutional Neural Network (CNN) and the subtitles to the same hidden space using an RNN with Gated Recurrent Units (GRU). Then, we apply soft-attention on candidate NFoVs to trigger sentence decoder aiming to minimize the reconstruct loss between the generated and given sentence. Finally, we obtain the NFoV as the candidate NFoV with the maximum attention without any human supervision.To train VGM more robustly, we also generate a reverse sentence conditioning on one minus the soft-attention such that the attention focuses on candidate NFoVs less relevant to the given sentence. The negative log reconstruction loss of the reverse sentence (referred to as ''irrelevant loss") is jointly minimized to encourage the reverse sentence to be different from the given sentence. To evaluate our method, we collect the first narrated 360° videos dataset and achieve state-of-the-art NFoV-grounding performance.

关键词

360 video; visual grounding; weakly supervised learning

全文

PDF


Using Syntax to Ground Referring Expressions in Natural Images

摘要

We introduce GroundNet, a neural network for referring expression recognition---the task of localizing (or grounding) in an image the object referred to by a natural language expression. Our approach to this task is the first to rely on a syntactic analysis of the input referring expression in order to inform the structure of the computation graph. Given a parse tree for an input expression, we explicitly map the syntactic constituents and relationships present in the tree to a composed graph of neural modules that defines our architecture for performing localization. This syntax-based approach aids localization of both the target object and auxiliary supporting objects mentioned in the expression. As a result, GroundNet is more interpretable than previous methods: we can (1) determine which phrase of the referring expression points to which object in the image and (2) track how the localization of the target object is determined by the network. We study this property empirically by introducing a new set of annotations on the GoogleRef dataset to evaluate localization of supporting objects. Our experiments show that GroundNet achieves state-of-the-art accuracy in identifying supporting objects, while maintaining comparable performance in the localization of target objects.

关键词

syntax; grounding; language and vision; neural networks

全文

PDF


Acquiring Common Sense Spatial Knowledge Through Implicit Spatial Templates

摘要

Spatial understanding is a fundamental problem with wide-reaching real-world applications. The representation of spatial knowledge is often modeled with spatial templates, i.e., regions of acceptability of two objects under an explicit spatial relationship (e.g., "on," "below," etc.). In contrast with prior work that restricts spatial templates to explicit spatial prepositions (e.g., "glass on table"), here we extend this concept to implicit spatial language, i.e., those relationships (generally actions) for which the spatial arrangement of the objects is only implicitly implied (e.g., "man riding horse"). In contrast with explicit relationships, predicting spatial arrangements from implicit spatial language requires significant common sense spatial understanding. Here, we introduce the task of predicting spatial templates for two objects under a relationship, which can be seen as a spatial question-answering task with a (2D) continuous output ("where is the man w.r.t. a horse when the man is walking the horse?"). We present two simple neural-based models that leverage annotated images and structured text to learn this task. The good performance of these models reveals that spatial locations are to a large extent predictable from implicit spatial language. Crucially, the models attain similar performance in a challenging generalized setting, where the object-relation-object combinations (e.g., "man walking dog") have never been seen before. Next, we go one step further by presenting the models with unseen objects (e.g., "dog"). In this scenario, we show that leveraging word embeddings enables the models to output accurate spatial predictions, proving that the models acquire solid common sense spatial knowledge allowing for such generalization.

关键词

commonsense; spatial knowledge; spatial understanding; spatial templates; natural language understanding; visual understanding

全文

PDF


PixelLink: Detecting Scene Text via Instance Segmentation

摘要

Most state-of-the-art scene text detection algorithms are deep learning based methods that depend on bounding box regression and perform at least two kinds of predictions: text/non-text classification and location regression. Regression plays a key role in the acquisition of bounding boxes in these methods, but it is not indispensable because text/non-text prediction can also be considered as a kind of semantic segmentation that contains full location information in itself. However, text instances in scene images often lie very close to each other, making them very difficult to separate via semantic segmentation. Therefore, instance segmentation is needed to address this problem. In this paper, PixelLink, a novel scene text detection algorithm based on instance segmentation, is proposed. Text instances are first segmented out by linking pixels within the same instance together. Text bounding boxes are then extracted directly from the segmentation result without location regression. Experiments show that, compared with regression-based methods, PixelLink can achieve better or comparable performance on several benchmarks, while requiring many fewer training iterations and less training data.

关键词

scene text detection; robust reading; instance segmentation

全文

PDF


ExprGAN: Facial Expression Editing With Controllable Expression Intensity

摘要

Facial expression editing is a challenging task as it needs a high-level semantic understanding of the input face image. In conventional methods, either paired training data is required or the synthetic face’s resolution is low. Moreover,only the categories of facial expression can be changed. To address these limitations, we propose an Expression Generative Adversarial Network (ExprGAN) for photo-realistic facial expression editing with controllable expression intensity. An expression controller module is specially designed to learn an expressive and compact expression code in addition to the encoder-decoder network. This novel architecture enables the expression intensity to be continuously adjusted from low to high. We further show that our ExprGAN can be applied for other tasks, such as expression transfer, image retrieval, and data augmentation for training improved face expression recognition models. To tackle the small size of the training database, an effective incremental learning scheme is proposed. Quantitative and qualitative evaluations on the widely used Oulu-CASIA dataset demonstrate the effectiveness of ExprGAN.

关键词

facial expression editing, generative adversarial network,

全文

PDF


A Deep Cascade Network for Unaligned Face Attribute Classification

摘要

Humans focus attention on different face regions when recognizing face attributes. Most existing face attribute classification methods use the whole image as input. Moreover, some of these methods rely on fiducial landmarks to provide defined face parts. In this paper, we propose a cascade network that simultaneously learns to localize face regions specific to attributes and performs attribute classification without alignment. First, a weakly-supervised face region localization network is designed to automatically detect regions (or parts) specific to attributes. Then multiple part-based networks and a whole-image-based network are separately constructed and combined together by the region switch layer and attribute relation layer for final attribute classification. A multi-net learning method and hint-based model compression is further proposed to get an effective localization model and a compact classification model, respectively. Our approach achieves significantly better performance than state-of-the-art methods on unaligned CelebA dataset, reducing the classification error by 30.9%.

关键词

face attributes classification, weakly supervised localization

全文

PDF


Auto-Balanced Filter Pruning for Efficient Convolutional Neural Networks

摘要

In recent years considerable research efforts have been devoted to compression techniques of convolutional neural networks (CNNs). Many works so far have focused on CNN connection pruning methods which produce sparse parameter tensors in convolutional or fully-connected layers. It has been demonstrated in several studies that even simple methods can effectively eliminate connections of a CNN. However, since these methods make parameter tensors just sparser but no smaller, the compression may not transfer directly to acceleration without support from specially designed hardware. In this paper, we propose an iterative approach named Auto-balanced Filter Pruning, where we pre-train the network in an innovative auto-balanced way to transfer the representational capacity of its convolutional layers to a fraction of the filters, prune the redundant ones, then re-train it to restore the accuracy. In this way, a smaller version of the original network is learned and the floating-point operations (FLOPs) are reduced. By applying this method on several common CNNs, we show that a large portion of the filters can be discarded without obvious accuracy drop, leading to significant reduction of computational burdens. Concretely, we reduce the inference cost of LeNet-5 on MNIST, VGG-16 and ResNet-56 on CIFAR-10 by 95.1%, 79.7% and 60.9%, respectively.

关键词

pruning; convolutional neural network; computer vision

全文

PDF


Hierarchical Nonlinear Orthogonal Adaptive-Subspace Self-Organizing Map Based Feature Extraction for Human Action Recognition

摘要

Feature extraction is a critical step in the task of action recognition. Hand-crafted features are often restricted because of their fixed forms and deep learning features are more effective but need large-scale labeled data for training. In this paper, we propose a new hierarchical Nonlinear Orthogonal Adaptive-Subspace Self-Organizing Map(NOASSOM) to adaptively and learn effective features from data without supervision. NOASSOM is extended from Adaptive-Subspace Self-Organizing Map (ASSOM) which only deals with linear data and is trained with supervision by the labeled data. Firstly, by adding a nonlinear orthogonal map layer, NOASSOM is able to handle the nonlinear input data and it avoids defining the specific form of the nonlinear orthogonal map by a kernel trick. Secondly, we modify loss function of ASSOM such that every input sample is used to train model individually. In this way, NOASSOM effectively learns the statistic patterns from data without supervision. Thirdly, we propose a hierarchical NOASSOM to extract more representative features. Finally, we apply the proposed hierarchical NOASSOM to efficiently describe the appearance and motion information around trajectories for action recognition. Experimental results on widely used datasets show that our method has superior performance than many state-of-the-art hand-crafted features and deep learning features based methods.

关键词

Feature Extraction; Action Recognition; Machine Learning

全文

PDF


Self-Reinforced Cascaded Regression for Face Alignment

摘要

Cascaded regression is prevailing in face alignment thanks to its accurate and robust localization of facial landmarks, but typically demands numerous annotated training examples of low discrepancy between shape-indexed features and shape updates. In this paper, we propose a self-reinforced strategy that iteratively expands the quantity and improves the quality of training examples, thus upgrading the performance of cascaded regression itself. The reinforced term evaluates the example quality upon the consistence on both local appearance and global geometry of human faces, and constitutes the example evolution by the philosophy of "survival of the fittest." We train a set of discriminative classifiers, each associated with one landmark label, to prune those examples with inconsistent local appearance, and further validate the geometric relationship among groups of labeled landmarks against the common global geometry derived from a projective invariant. We embed this generic strategy into two typical cascaded regressions, and the alignment results on several benchmark data sets demonstrate the effectiveness of training regressions with automatic example prediction and evolution starting from a small subset.

关键词

Face alignment; Cascaded regression; Self-reinforcement

全文

PDF


Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation

摘要

In this paper, we propose a pose grammar to tackle the problem of 3D human pose estimation. Our model directly takes 2D pose as input and learns a generalized 2D-3D mapping function. The proposed model consists of a base network which efficiently captures pose-aligned features and a hierarchy of Bi-directional RNNs (BRNN) on the top to explicitly incorporate a set of knowledge regarding human body configuration (i.e., kinematics, symmetry, motor coordination). The proposed model thus enforces high-level constraints over human poses. In learning, we develop a pose sample simulator to augment training samples in virtual camera views, which further improves our model generalizability. We validate our method on public 3D human pose benchmarks and propose a new evaluation protocol working on cross-view setting to verify the generalization capability of different methods. We empirically observe that most state-of-the-art methods encounter difficulty under such setting while our method can well handle such challenges.

关键词

3D Pose Estimation; Deep Grammar Network; Pose Grammar; Deep Neural Network

全文

PDF


Unravelling Robustness of Deep Learning Based Face Recognition Against Adversarial Attacks

摘要

Deep neural network (DNN) architecture based models have high expressive power and learning capacity. However, they are essentially a black box method since it is not easy to mathematically formulate the functions that are learned within its many layers of representation. Realizing this, many researchers have started to design methods to exploit the drawbacks of deep learning based algorithms questioning their robustness and exposing their singularities. In this paper, we attempt to unravel three aspects related to the robustness of DNNs for face recognition: (i) assessing the impact of deep architectures for face recognition in terms of vulnerabilities to attacks inspired by commonly observed distortions in the real world that are well handled by shallow learning methods along with learning based adversaries; (ii) detecting the singularities by characterizing abnormal filter response behavior in the hidden layers of deep networks; and (iii) making corrections to the processing pipeline to alleviate the problem. Our experimental evaluation using multiple open-source DNN-based face recognition networks, including OpenFace and VGG-Face, and two publicly available databases (MEDS and PaSC) demonstrates that the performance of deep learning based face recognition algorithms can suffer greatly in the presence of such distortions. The proposed method is also compared with existing detection algorithms and the results show that it is able to detect the attacks with very high accuracy by suitably designing a classifier using the response of the hidden layers in the network. Finally, we present several effective countermeasures to mitigate the impact of adversarial attacks and improve the overall robustness of DNN-based face recognition.

关键词

CNN; Deep Learning; Adversarial Attacks

全文

PDF


Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

摘要

The existing image captioning approaches typically train a one-stage sentence decoder, which is difficult to generate rich fine-grained descriptions. On the other hand, multi-stage image caption model is hard to train due to the vanishing gradient problem. In this paper, we propose a coarse-to-fine multi-stage prediction framework for image captioning, composed of multiple decoders each of which operates on the output of the previous stage, producing increasingly refined image descriptions. Our proposed learning approach addresses the difficulty of vanishing gradients during training by providing a learning objective function that enforces intermediate supervisions. Particularly, we optimize our model with a reinforcement learning approach which utilizes the output of each intermediate decoder's test-time inference algorithm as well as the output of its preceding decoder to normalize the rewards, which simultaneously solves the well-known exposure bias problem and the loss-evaluation mismatch problem. We extensively evaluate the proposed approach on MSCOCO and show that our approach can achieve the state-of-the-art performance.

关键词

VIS

全文

PDF


Hierarchical LSTM for Sign Language Translation

摘要

Continuous Sign Language Translation (SLT) is a challenging task due to its specific linguistics under sequential gesture variation without word alignment. Current hybrid HMM and CTC (Connectionist temporal classification) based models are proposed to solve frame or word level alignment. They may fail to tackle the cases with messing word order corresponding to visual content in sentences. To solve the issue, this paper proposes a hierarchical-LSTM (HLSTM) encoder-decoder model with visual content and word embedding for SLT. It tackles different granularities by conveying spatio-temporal transitions among frames, clips and viseme units. It firstly explores spatio-temporal cues of video clips by 3D CNN and packs appropriate visemes by online key clip mining with adaptive variable-length. After pooling on recurrent outputs of the top layer of HLSTM, a temporal attention-aware weighting mechanism is proposed to balance the intrinsic relationship among viseme source positions. At last, another two LSTM layers are used to separately recurse viseme vectors and translate semantic. After preserving original visual content by 3D CNN and the top layer of HLSTM, it shortens the encoding time step of the bottom two LSTM layers with less computational complexity while attaining more nonlinearity. Our proposed model exhibits promising performance on singer-independent test with seen sentences and also outperforms the comparison algorithms on unseen sentences.

关键词

全文

PDF


Learning Coarse-to-Fine Structured Feature Embedding for Vehicle Re-Identification

摘要

Vehicle re-identification (re-ID) is to identify the same vehicle across different cameras. It’s a significant but challenging topic, which has received little attention due to the complex intra-class and inter-class variation of vehicle images and the lack of large-scale vehicle re-ID dataset. Previous methods focus on pulling images from different vehicles apart but neglect the discrimination between vehicles from different vehicle models, which is actually quite important to obtain a correct ranking order for vehicle re-ID. In this paper, we learn a structured feature embedding for vehicle re-ID with a novel coarse-to-fine ranking loss to pull images of the same vehicle as close as possible and achieve discrimination between images from different vehicles as well as vehicles from different vehicle models. In the learnt feature space, both intra-class compactness and inter-class distinction are well guaranteed and the Euclidean distance between features directly reflects the semantic similarity of vehicle images. Furthermore, we build so far the largest vehicle re-ID dataset "Vehicle-1M," which involves nearly 1 million images captured in various surveillance scenarios. Experimental results on "Vehicle-1M" and "VehicleID" demonstrate the superiority of our proposed approach.

关键词

Vehicle re-identification;CNN;ranking loss

全文

PDF


Residual Encoder Decoder Network and Adaptive Prior for Face Parsing

摘要

Face Parsing assigns every pixel in a facial image with a semantic label, which could be applied in various applications including face recognition, facial beautification, affective computing and animation. While lots of progress have been made in this field, current state-of-the-art methods still fail to extract real effective feature and restore accurate score map, especially for those facial parts which have large variations of deformation and fairly similar appearance, e.g. mouth, eyes and thin eyebrows. In this paper, we propose a novel pixel-wise face parsing method called Residual Encoder Decoder Network (RED-Net), which combines a feature-rich encoder-decoder framework with adaptive prior mechanism. Our encoder-decoder framework extracts feature with ResNet and decodes the feature by elaborately fusing the residual architectures in to deconvolution. This framework learns more effective feature comparing to that learnt by decoding with interpolation or classic deconvolution operations. To overcome the appearance ambiguity between facial parts, an adaptive prior mechanism is proposed in term of the decoder prediction confidence, allowing refining the final result. The experimental results on two public datasets demonstrate that our method outperforms the state-of-the-arts significantly, achieving improvements of F-measure from 0.854 to 0.905 on Helen dataset, and pixel accuracy from 95.12% to 97.59% on the LFW dataset. In particular, convincing qualitative examples show that our method parses eye, eyebrow, and lip regins more accurately.

关键词

face parsing; encoder decoder; redisual network; adaptive prior

全文

PDF


Zero-Shot Learning With Attribute Selection

摘要

Zero-shot learning (ZSL) is regarded as an effective way to construct classification models for target classes which have no labeled samples available. The basic framework is to transfer knowledge from (different) auxiliary source classes having sufficient labeled samples with some attributes shared by target and source classes as bridge. Attributes play an important role in ZSL but they have not gained sufficient attention in recent years. Previous works mostly assume attributes are perfect and treat each attribute equally. However, as shown in this paper, different attributes have different properties, such as their class distribution, variance, and entropy, which may have considerable impact on ZSL accuracy if treated equally. Based on this observation, in this paper we propose to use a subset of attributes, instead of the whole set, for building ZSL models. The attribute selection is conducted by considering the information amount and predictability under a novel joint optimization framework. To our knowledge, this is the first work that notices the influence of attributes themselves and proposes to use a refined attribute set for ZSL. Since our approach focuses on selecting good attributes for ZSL, it can be combined to any attribute based ZSL approaches so as to augment their performance. Experiments on four ZSL benchmarks demonstrate that our approach can improve zero-shot classification accuracy and yield state-of-the-art results.

关键词

zero-shot learning, selection, attribute

全文

PDF


Doing the Best We Can With What We Have: Multi-Label Balancing With Selective Learning for Attribute Prediction

摘要

Attributes are human describable features, which have been used successfully for face, object, and activity recognition. Facial attributes are intuitive descriptions of faces and have proven to be very useful in face recognition and verification. Despite their usefulness, to date there is only one large-scale facial attribute dataset, CelebA. Impressive results have been achieved on this dataset, but it exhibits a variety of very significant biases. As CelebA contains mostly frontal idealized images of celebrities, it is difficult to generalize a model trained on this data for use on another dataset (of non celebrities). A typical approach to dealing with imbalanced data involves sampling the data in order to balance the positive and negative labels, however, with a multi-label problem this becomes a non-trivial task. By sampling to balance one label, we affect the distribution of other labels in the data. To address this problem, we introduce a novel Selective Learning method for deep networks which adaptively balances the data in each batch according to the desired distribution for each label. The bias in CelebA can be corrected for in this way, allowing the network to learn a more robust attribute model. We argue that without this multi-label balancing, the network cannot learn to accurately predict attributes that are poorly represented in CelebA. We demonstrate the effectiveness of our method on the problem of facial attribute prediction on CelebA, LFWA, and the new University of Maryland Attribute Evaluation Dataset (UMD-AED), outperforming the state-of-the-art on each dataset.

关键词

Attributes; Face Recognition; CNN

全文

PDF


CMCGAN: A Uniform Framework for Cross-Modal Visual-Audio Mutual Generation

摘要

Visual and audio modalities are two symbiotic modalities underlying videos, which contain both common and complementary information. If they can be mined and fused sufficiently, performances of related video tasks can be significantly enhanced. However, due to the environmental interference or sensor fault, sometimes, only one modality exists while the other is abandoned or missing. By recovering the missing modality from the existing one based on the common information shared between them and the prior information of the specific modality, great bonus will be gained for various vision tasks. In this paper, we propose a Cross-Modal Cycle Generative Adversarial Network (CMCGAN) to handle cross-modal visual-audio mutual generation. Specifically, CMCGAN is composed of four kinds of subnetworks: audio-to-visual, visual-to-audio, audio-to-audio and visual-to-visual subnetworks respectively, which are organized in a cycle architecture. CMCGAN has several remarkable advantages. Firstly, CMCGAN unifies visual-audio mutual generation into a common framework by a joint corresponding adversarial loss. Secondly, through introducing a latent vector with Gaussian distribution, CMCGAN can handle dimension and structure asymmetry over visual and audio modalities effectively. Thirdly, CMCGAN can be trained end-to-end to achieve better convenience. Benefiting from CMCGAN, we develop a dynamic multimodal classification network to handle the modality missing problem. Abundant experiments have been conducted and validate that CMCGAN obtains the state-of-the-art cross-modal visual-audio generation results. Furthermore, it is shown that the generated modality achieves comparable effects with those of original modality, which demonstrates the effectiveness and advantages of our proposed method.

关键词

cross-modal; visual-audio generation

全文

PDF


Integrating Both Visual and Audio Cues for Enhanced Video Caption

摘要

Video caption refers to generating a descriptive sentence for a specific short video clip automatically, which has achieved remarkable success recently. However, most of the existing methods focus more on visual information while ignoring the synchronized audio cues. We propose three multimodal deep fusion strategies to maximize the benefits of visual-audio resonance information. The first one explores the impact on cross-modalities feature fusion from low to high order. The second establishes the visual-audio short-term dependency by sharing weights of corresponding front-end networks. The third extends the temporal dependency to long-term through sharing multimodal memory across visual and audio modalities. Extensive experiments have validated the effectiveness of our three cross-modalities fusion strategies on two benchmark datasets, including Microsoft Research Video to Text (MSRVTT) and Microsoft Video Description (MSVD). It is worth mentioning that sharing weight can coordinate visual- audio feature fusion effectively and achieve the state-of-art performance on both BELU and METEOR metrics. Furthermore, we first propose a dynamic multimodal feature fusion framework to deal with the part modalities missing case. Experimental results demonstrate that even in the audio absence mode, we can still obtain comparable results with the aid of the additional audio modality inference module.

关键词

video caption; visual and audio feature fusion; modality absent

全文

PDF


Merge or Not? Learning to Group Faces via Imitation Learning

摘要

Face grouping remains a challenging problem despite the remarkable capability of deep learning approaches in learning face representation. In particular, grouping results can still be egregious given profile faces and a large number of uninteresting faces and noisy detections. Often, a user needs to correct the erroneous grouping manually. In this study, we formulate a novel face grouping framework that learns clustering strategy from ground-truth simulated behavior. This is achieved through imitation learning (a.k.a apprenticeship learning or learning by watching) via inverse reinforcement learning (IRL). In contrast to existing clustering approaches that group instances by similarity, our framework makes sequential decision to dynamically decide when to merge two face instances/groups driven by short- and long-term rewards. Extensive experiments on three benchmark datasets show that our framework outperforms unsupervised and supervised baselines.

关键词

Reinforcement Learning; Face Cluster; Face Recognition

全文

PDF


Unsupervised Deep Learning of Mid-Level Video Representation for Action Recognition

摘要

Current deep learning methods for action recognition rely heavily on large scale labeled video datasets. Manually annotating video datasets is laborious and may introduce unexpected bias to train complex deep models for learning video representation. In this paper, we propose an unsupervised deep learning method which employs unlabeled local spatial-temporal volumes extracted from action videos to learn midlevel video representation for action recognition. Specifically, our method simultaneously discovers mid-level semantic concepts by discriminative clustering and optimizes local spatial-temporal features by two relatively small and simple deep neural networks. The clustering generates semantic visual concepts that guide the training of the deep networks, and the networks in turn guarantee the robustness of the semantic concepts. Experiments on the HMDB51 and the UCF101 datasets demonstrate the superiority of the proposed method, even over several supervised learning methods.

关键词

全文

PDF


Dual-Reference Face Retrieval

摘要

Face retrieval has received much attention over the past few decades, and many efforts have been made in retrieving face images against pose, illumination, and expression variations. However, the conventional works fail to meet the requirements of a potential and novel task---retrieving a person's face image at a specific age, especially when the specific "age" is not given as a numeral, i.e. "retrieving someone's image at the similar age period shown by another person's image." To tackle this problem, we propose a dual reference face retrieval framework in this paper, where the system takes two inputs: an identity reference image which indicates the target identity and an age reference image which reflects the target age. In our framework, the raw images are first projected on a joint manifold, which preserves both the age and identity locality. Then two similarity metrics of age and identity are exploited and optimized by utilizing our proposed quartet-based model. The experiments show promising results, outperforming hierarchical methods.

关键词

Face Retrieval

全文

PDF


Facial Landmarks Detection by Self-Iterative Regression Based Landmarks-Attention Network

摘要

Cascaded Regression (CR) based methods have been proposed to solve facial landmarks detection problem, which learn a series of descent directions by multiple cascaded regressors separately trained in coarse and fine stages. They outperform the traditional gradient descent based methods in both accuracy and running speed. However, cascaded regression is not robust enough because each regressor's training data comes from the output of previous regressor. Moreover, training multiple regressors requires lots of computing resources, especially for deep learning based methods. In this paper, we develop a Self-Iterative Regression (SIR) framework to improve the model efficiency. Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and parameters are iteratively updated by the same regressor. Specifically, we proposed Landmarks-Attention Network (LAN) as our regressor, which concurrently learns features around each landmark and obtains the holistic location increment. By doing so, not only the rest of regressors are removed to simplify the training process, but the number of model parameters is significantly decreased. The experiments demonstrate that with only 3.72M model parameters, our proposed method achieves the state-of-the-art performance.

关键词

facial landmarks detection;non-linear least squares optimization

全文

PDF


Learning Adaptive Hidden Layers for Mobile Gesture Recognition

摘要

This paper addresses two obstacles hindering advances in accurate gesture recognition on mobile devices. First, gesture recognition performance is highly dependent on feature selection, but optimal features typically vary from gesture to gesture. Second, diverse user behaviors and mobile environments result in extremely large intra-class variations. We tackle these issues by introducing a new network layer, called an adaptive hidden layer (AHL), to generalize a hidden layer in deep neural networks and dynamically generate an activation map conditioned on the input. To this end, an AHL is composed of multiple neuron groups and an extra selector. The former compiles multi-modal features captured by mobile sensors, while the latter adaptively picks a plausible group for each input sample. The AHL is end-to-end trainable and can generalize an arbitrary subset of hidden layers. Through a series of AHLs, the great expressive power from exponentially many forward paths allows us to choose proper multi-modal features in a sample-specific fashion and resolve the problems caused by the unfavorable variations in mobile gesture recognition. The proposed approach is evaluated on a benchmark for gesture recognition and a newly collected dataset. Superior performance demonstrates its effectiveness.

关键词

deep learning; gesture recognition; adaptive hidden layer

全文

PDF


Recurrently Aggregating Deep Features for Salient Object Detection

摘要

Salient object detection is a fundamental yet challenging problem in computer vision, aiming to highlight the most visually distinctive objects or regions in an image. Recent works benefit from the development of fully convolutional neural networks (FCNs) and achieve great success by integrating features from multiple layers of FCNs. However, the integrated features tend to include non-salient regions (due to low level features of the FCN) or lost details of salient objects (due to high level features of the FCN) when producing the saliency maps. In this paper, we develop a novel deep saliency network equipped with recurrently aggregated deep features (RADF) to more accurately detect salient objects from an image by fully exploiting the complementary saliency information captured in different layers. The RADF utilizes the multi-level features integrated from different layers of a FCN to recurrently refine the features at each layer, suppressing the non-salient noise at low-level of the FCN and increasing more salient details into features at high layers. We perform experiments to evaluate the effectiveness of the proposed network on 5 famous saliency detection benchmarks and compare it with 15 state-of-the-art methods. Our method ranks first in 4 of the 5 datasets and second in the left dataset.

关键词

Salient Object; Deep Features; Saliency Detection

全文

PDF


SAP: Self-Adaptive Proposal Model for Temporal Action Detection Based on Reinforcement Learning

摘要

Existing action detection algorithms usually generate action proposals through an extensive search over the video at multiple temporal scales, which brings about huge computational overhead and deviates from the human perception procedure. We argue that the process of detecting actions should be naturally one of observation and refinement: observe the current window and refine the span of attended window to cover true action regions. In this paper, we propose a Self-Adaptive Proposal (SAP) model that learns to find actions through continuously adjusting the temporal bounds in a self-adaptive way. The whole process can be deemed as an agent, which is firstly placed at the beginning of the video and traverse the whole video by adopting a sequence of transformations on the current attended region to discover actions according to a learned policy. We utilize reinforcement learning, especially the Deep Q-learning algorithm to learn the agent’s decision policy. In addition, we use temporal pooling operation to extract more effective feature representation for the long temporal window, and design a regression network to adjust the position offsets between predicted results and the ground truth. Experiment results on THUMOS’14 validate the effectiveness of SAP, which can achieve competitive performance with current action detection algorithms via much fewer proposals.

关键词

Computer vision; Action detection; Reinforcement learning

全文

PDF


Learning to Guide Decoding for Image Captioning

摘要

Recently, much advance has been made in image captioning, and an encoder-decoder framework has achieved outstanding performance for this task. In this paper, we propose an extension of the encoder-decoder framework by adding a component called guiding network. The guiding network models the attribute properties of input images, and its output is leveraged to compose the input of the decoder at each time step. The guiding network can be plugged into the current encoder-decoder framework and trained in an end-to-end manner. Hence, the guiding vector can be adaptively learned according to the signal from the decoder, making itself to embed information from both image and language. Additionally, discriminative supervision can be employed to further improve the quality of guidance. The advantages of our proposed approach are verified by experiments carried out on the MS COCO dataset.

关键词

image captioning

全文

PDF


Deep Low-Resolution Person Re-Identification

摘要

Person images captured by public surveillance cameras often have low resolutions (LR) in addition to uncontrolled pose variations, background clutters and occlusions. This gives rise to the resolution mismatch problem when matched against the high resolution (HR) gallery images (typically available in enrolment), which adversely affects the performance of person re-identification (re-id) that aims to associate images of the same person captured at different locations and different time. Most existing re-id methods either ignore this problem or simply upscale LR images. In this work, we address this problem by developing a novel approach called Super-resolution and Identity joiNt learninG (SING) to simultaneously optimise image super-resolution and person re-id matching. This approach is instantiated by designing a hybrid deep Convolutional Neural Network for improving cross-resolution re-id performance. We further introduce an adaptive fusion algorithm for accommodating multi-resolution LR images. Extensive evaluations show the advantages of our method over related state-of-the-art re-id and super-resolution methods on cross-resolution re-id benchmarks.

关键词

全文

PDF


Co-Domain Embedding Using Deep Quadruplet Networks for Unseen Traffic Sign Recognition

摘要

Recent advances in visual recognition show overarching success by virtue of large amounts of supervised data. However, the acquisition of a large supervised dataset is often challenging. This is also true for intelligent transportation applications, i.e., traffic sign recognition. For example, a model trained with data of one country may not be easily generalized to another country without much data. We propose a novel feature embedding scheme for unseen class classification when the representative class template is given. Traffic signs, unlike other objects, have official images. We perform co-domain embedding using a quadruple relationship from real and synthetic domains. Our quadruplet network fully utilizes the explicit pairwise similarity relationships among samples from different domains. We validate our method on three datasets with two experiments involving one-shot classification and feature generalization. The results show that the proposed method outperforms competing approaches on both seen and unseen classes.

关键词

Domain adaptation; Data Imbalance

全文

PDF


Multispectral Transfer Network: Unsupervised Depth Estimation for All-Day Vision

摘要

To understand the real-world, it is essential to perceive in all-day conditions including cases which are not suitable for RGB sensors, especially at night. Beyond these limitations, the innovation introduced here is a multispectral solution in the form of depth estimation from a thermal sensor without an additional depth sensor.Based on an analysis of multispectral properties and the relevance to depth predictions, we propose an efficient and novel multi-task framework called the Multispectral Transfer Network (MTN) to estimate a depth image from a single thermal image. By exploiting geometric priors and chromaticity clues, our model can generate a pixel-wise depth image in an unsupervised manner. Moreover, we propose a new type of multitask module called Interleaver as a means of incorporating the chromaticity and fine details of skip-connections into the depth estimation framework without sharing feature layers. Lastly, we explain a novel technical means of stably training and covering large disparities and extending thermal images to data-driven methods for all-day conditions. In experiments, we demonstrate the better performance and generalization of depth estimation through the proposed multispectral stereo dataset, including various driving conditions.

关键词

multispectral learning; depth estimation; all-day vision; deep neural network; transfer learning; multi-task learning

全文

PDF


Generating Triples With Adversarial Networks for Scene Graph Construction

摘要

Driven by successes in deep learning, computer vision research has begun to move beyond object detection and image classification to more sophisticated tasks like image captioning or visual question answering. Motivating such endeavors is the desire for models to capture not only objects present in an image, but more fine-grained aspects of a scene such as relationships between objects and their attributes. Scene graphs provide a formal construct for capturing these aspects of an image. Despite this, there have been only a few recent efforts to generate scene graphs from imagery. Previous works limit themselves to settings where bounding box information is available at train time and do not attempt to generate scene graphs with attributes. In this paper we propose a method, based on recent advancements in Generative Adversarial Networks, to overcome these deficiencies. We take the approach of first generating small subgraphs, each describing a single statement about a scene from a specific region of the input image chosen using an attention mechanism. By doing so, our method is able to produce portions of the scene graphs with attribute information without the need for bounding box labels. Then, the complete scene graph is constructed from these subgraphs. We show that our model improves upon prior work in scene graph generation on state-of-the-art data sets and accepted metrics. Further, we demonstrate that our model is capable of handling a larger vocabulary size than prior work has attempted.

关键词

Deep Learning; Computer Vision; Adversarial Learning

全文

PDF


Action Prediction From Videos via Memorizing Hard-to-Predict Samples

摘要

Action prediction based on video is an important problem in computer vision field with many applications, such as preventing accidents and criminal activities. It's challenging to predict actions at the early stage because of the large variations between early observed videos and complete ones. Besides, intra-class variations cause confusions to the predictors as well. In this paper, we propose a mem-LSTM model to predict actions in the early stage, in which a memory module is introduced to record several "hard-to-predict" samples and a variety of early observations. Our method uses Convolution Neural Network (CNN) and Long Short-Term Memory (LSTM) to model partial observed video input. We augment LSTM with a memory module to remember challenging video instances. With the memory module, our mem-LSTM model not only achieves impressive performance in the early stage but also makes predictions without the prior knowledge of observation ratio. Information in future frames is also utilized using a bi-directional layer of LSTM. Experiments on UCF-101 and Sports-1M datasets show that our method outperforms state-of-the-art methods.

关键词

Action prediction; Action recognition;Video analysis; Deep learning

全文

PDF


Robust Collaborative Discriminative Learning for RGB-Infrared Tracking

摘要

Tracking target of interests is an important step for motion perception in intelligent video surveillance systems. While most recently developed tracking algorithms are grounded in RGB image sequences, it should be noted that information from RGB modality is not always reliable (e.g. in a dark environment with poor lighting condition), which urges the need to integrate information from infrared modality for effective tracking because of the insensitivity to illumination condition of infrared thermal camera. However, several issues encountered during the tracking process limit the fusing performance of these heterogeneous modalities: 1) the cross-modality discrepancy of visual and motion characteristics, 2) the uncertainty of degree of reliability in different modalities, and 3) large target appearance variations and background distractions within each modality. To address these issues, this paper proposes a novel and optimal discriminative learning framework for multi-modality tracking. In particular, the proposed discriminative learning framework is able to: 1) jointly eliminate outlier samples caused by large variations and learn discriminability-consistent features from heterogeneous modalities, and 2) collaboratively perform modality reliability measurement and target-background separation. Extensive experiments on RGB-infrared image sequences demonstrate the effectiveness of the proposed method.

关键词

visual tracking

全文

PDF


End-to-End United Video Dehazing and Detection

摘要

The recent development of CNN-based image dehazing has revealed the effectiveness of end-to-end modeling. However, extending the idea to end-to-end video dehazing has not been explored yet. In this paper, we propose an End-to-End Video Dehazing Network (EVD-Net), to exploit the temporal consistency between consecutive video frames. A thorough study has been conducted over a number of structure options, to identify the best temporal fusion strategy. Furthermore, we build an End-to-End United Video Dehazing and Detection Network (EVDD-Net), which concatenates and jointly trains EVD-Net with a video object detection model. The resulting augmented end-to-end pipeline has demonstrated much more stable and accurate detection results in hazy video.

关键词

Vision; Machine Learning Applications

全文

PDF


Weakly Supervised Salient Object Detection Using Image Labels

摘要

Deep learning based salient object detection has recently achieved great success with its performance greatly outperforms any other unsupervised methods. However, annotating per-pixel saliency masks is a tedious and inefficient procedure. In this paper, we note that superior salient object detection can be obtained by iteratively mining and correcting the labeling ambiguity on saliency maps from traditional unsupervised methods. We propose to use the combination of a coarse salient object activation map from the classification network and saliency maps generated from unsupervised methods as pixel-level annotation, and develop a simple yet very effective algorithm to train fully convolutional networks for salient object detection supervised by these noisy annotations. Our algorithm is based on alternately exploiting a graphical model and training a fully convolutional network for model updating. The graphical model corrects the internal labeling ambiguity through spatial consistency and structure preserving while the fully convolutional network helps to correct the cross-image semantic ambiguity and simultaneously update the coarse activation map for next iteration. Experimental results demonstrate that our proposed method greatly outperforms all state-of-the-art unsupervised saliency detection methods and can be comparable to the current best strongly-supervised methods training with thousands of pixel-level saliency map annotations on all public benchmarks.

关键词

Saliency;Weakly supervised learning

全文

PDF


Brute-Force Facial Landmark Analysis With a 140,000-Way Classifier

摘要

We propose a simple approach to visual alignment, focusing on the illustrative task of facial landmark estimation. While most prior work treats this as a regression problem, we instead formulate it as a discrete K-way classification task, where a classifier is trained to return one of K discrete alignments. One crucial benefit of a classifier is the ability to report back a (softmax) distribution over putative alignments. We demonstrate that this distribution is a rich representation that can be marginalized (to generate uncertainty estimates over groups of landmarks) and conditioned on (to incorporate top-down context, provided by temporal constraints in a video stream or an interactive human user). Such capabilities are difficult to integrate into classic regression-based approaches. We study performance as a function of the number of classes K, including the extreme "exemplar class" setting where K is equal to the number of training examples (140K in our setting). Perhaps surprisingly, we show that classifiers can still be learned in this setting. When compared to prior work in classification, our K is unprecedentedly large, including many "fine-grained" classes that are very similar. We address these issues by using a multi-label loss function that allows for training examples to be non-uniformly shared across discrete classes. We perform a comprehensive experimental analysis of our method on standard benchmarks, demonstrating state-of-the-art results for facial alignment in videos.

关键词

face alignment; facial landmark; uncertainty; regression as classification

全文

PDF


DF2Net: Discriminative Feature Learning and Fusion Network for RGB-D Indoor Scene Classification

摘要

This paper focuses on the task of RGB-D indoor scene classification. It is a very challenging task due to two folds. 1) Learning robust representation for indoor scene is difficult because of various objects and layouts. 2) Fusing the complementary cues in RGB and Depth is nontrivial since there are large semantic gaps between the two modalities. Most existing works learn representation for classification by training a deep network with softmax loss and fuse the two modalities by simply concatenating the features of them. However, these pipelines do not explicitly consider intra-class and inter-class similarity as well as inter-modal intrinsic relationships. To address these problems, this paper proposes a Discriminative Feature Learning and Fusion Network (DF2Net) with two-stage training. In the first stage, to better represent scene in each modality, a deep multi-task network is constructed to simultaneously minimize the structured loss and the softmax loss. In the second stage, we design a novel discriminative fusion network which is able to learn correlative features of multiple modalities and distinctive features of each modality. Extensive analysis and experiments on SUN RGB-D Dataset and NYU Depth Dataset V2 show the superiority of DF2Net over other state-of-the-art methods in RGB-D indoor scene classification task.

关键词

RGB-D; scene recognition; multi-modal

全文

PDF


Deep Semantic Structural Constraints for Zero-Shot Learning

摘要

Zero-shot learning aims to classify unseen image categories by learning a visual-semantic embedding space. In most cases, the traditional methods adopt a separated two-step pipeline that extracts image features are utilized to learn the embedding space. It leads to the lack of specific structural semantic information of image features for zero-shot learning task. In this paper, we propose an end-to-end trainable Deep Semantic Structural Constraints model to address this issue. The proposed model contains the Image Feature Structure constraint and the Semantic Embedding Structure constraint, which aim to learn structure-preserving image features and endue the learned embedding space with stronger generalization ability respectively. With the assistance of semantic structural information, the model gains more auxiliary clues for zero-shot learning. The state-of-the-art performance certifies the effectiveness of our proposed method.

关键词

全文

PDF


Anti-Makeup: Learning A Bi-Level Adversarial Network for Makeup-Invariant Face Verification

摘要

Makeup is widely used to improve facial attractiveness and is well accepted by the public. However, different makeup styles will result in significant facial appearance changes. It remains a challenging problem to match makeup and non-makeup face images. This paper proposes a learning from generation approach for makeup-invariant face verification by introducing a bi-level adversarial network (BLAN). To alleviate the negative effects from makeup, we first generate non-makeup images from makeup ones, and then use the synthesized non-makeup images for further verification. Two adversarial networks in BLAN are integrated in an end-to-end deep network, with the one on pixel level for reconstructing appealing facial images and the other on feature level for preserving identity information. These two networks jointly reduce the sensing gap between makeup and non-makeup images. Moreover, we make the generator well constrained by incorporating multiple perceptual losses. Experimental results on three benchmark makeup face datasets demonstrate that our method achieves state-of-the-art verification accuracy across makeup status and can produce photo-realistic non-makeup face images.

关键词

全文

PDF


Video Generation From Text

摘要

Generating videos from text has proven to be a significant challenge for existing generative models. We tackle this problem by training a conditional generative model to extract both static and dynamic information from text. This is manifested in a hybrid framework, employing a Variational Autoencoder (VAE) and a Generative Adversarial Network (GAN). The static features, called "gist," are used to sketch text-conditioned background color and object layout structure. Dynamic features are considered by transforming input text into an image filter. To obtain a large amount of data for training the deep-learning model, we develop a method to automatically create a matched text-video corpus from publicly available online videos. Experimental results show that the proposed framework generates plausible and diverse short-duration smooth videos, while accurately reflecting the input text information. It significantly outperforms baseline models that directly adapt text-to-image generation procedures to produce videos. Performance is evaluated both visually and by adapting the inception score used to evaluate image generation in GANs.

关键词

video generation; variational autoencoder; generative adversarial network

全文

PDF


R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

摘要

Region based detectors like Faster R-CNN and R-FCN have achieved leading performance on object detection benchmarks. However, in Faster R-CNN, RoI pooling is used to extract feature of each region, which might harm the classification as the RoI pooling loses spatial resolution. Also it gets slow when a large number of proposals are utilized. R-FCN is a fully convolutional structure that uses a position-sensitive pooling layer to extract prediction score of each region, which speeds up network by sharing computation of RoIs and prevents the feature map from losing information in RoI-pooling. But R-FCN can not benefit from fully connected layer (or global average pooling), which enables Faster R-CNN to utilize global context information. In this paper, we propose R-FCN++ to address this issue in two-fold: first we involve Global Context Module to improve the classification score maps by adopting large, separable convolutional kernels. Second we introduce a new pooling method to better extract scores from the score maps, by using row-wise or column-wise max pooling. Our approach achieves state-of-the-art single-model results on both Pascal VOC and MS COCO object detection benchmarks, 87.3% on Pascal VOC 2012 test dataset and 42.3% on COCO 2015 test-dev dataset. Code will be made publicly available.

关键词

Computer Vision;Object Detection

全文

PDF


Multi-Rate Gated Recurrent Convolutional Networks for Video-Based Pedestrian Re-Identification

摘要

Matching pedestrians across multiple camera views has attracted lots of recent research attention due to its apparent importance in surveillance and security applications.While most existing works address this problem in a still-image setting, we consider the more informative and challenging video-based person re-identification problem, where a video of a pedestrian as seen in one camera needs to be matched to a gallery of videos captured by other non-overlapping cameras. We employ a convolutional network to extract the appearance and motion features from raw video sequences, and then feed them into a multi-rate recurrent network to exploit the temporal correlations, and more importantly, to take into account the fact that pedestrians, sometimes even the same pedestrian, move in different speeds across different camera views. The combined network is trained in an end-to-end fashion, and we further propose an initialization strategy via context reconstruction to largely improve the performance. We conduct extensive experiments on the iLIDS-VID and PRID-2011 datasets, and our experimental results confirm the effectiveness and the generalization ability of our model.

关键词

Video-based person re-id; Motion Variances; LSTM

全文

PDF


Cross-View Person Identification by Matching Human Poses Estimated With Confidence on Each Body Joint

摘要

Cross-view person identification (CVPI) from multiple temporally synchronized videos taken by multiple wearable cameras from different, varying views is a very challenging but important problem, which has attracted more interests recently. Current state-of-the-art performance of CVPI is achieved by matching appearance and motion features across videos, while the matching of pose features does not work effectively given the high inaccuracy of the 3D human pose estimation on videos/images collected in the wild. In this paper, we introduce a new metric of confidence to the 3D human pose estimation and show that the combination of the inaccurately estimated human pose and the inferred confidence metric can be used to boost the CVPI performance---the estimated pose information can be integrated to the appearance and motion features to achieve the new state-of-the-art CVPI performance. More specifically, the estimated confidence metric is measured at each human-body joint and the joints with higher confidence are weighted more in the pose matching for CVPI. In the experiments, we validate the proposed method on three wearable-camera video datasets and compare the performance against several other existing CVPI methods.

关键词

Cross-view Person Identification; Confidence Metric; Human Pose Estimation

全文

PDF


Visual Relationship Detection With Deep Structural Ranking

摘要

Visual relationship detection aims to describe the interactions between pairs of objects. Different from individual object learning tasks, the number of possible relationships are much larger, which makes it hard to explore only based on the visual appearance of objects. In addition, due to the limited human effort, the annotations for visual relationships are usually incomplete which increases the difficulty of model training and evaluation. In this paper, we propose a novel framework, called Deep Structural Ranking, for visual relationship detection. To complement the representation ability of visual appearance, we integrate multiple cues for predicting the relationships contained in an input image. Moreover, we design a new ranking objective function by enforcing the annotated relationships to have higher relevance scores. Unlike previous works, our proposed method can both facilitate the co-occurrence of relationships and mitigate the incompleteness problem. Experimental results show that our proposed method outperforms the state-of-the-art on the two widely used datasets. We also demonstrate its superiority in detecting zero-shot relationships.

关键词

Visual relationship detection; Image understanding

全文

PDF


Tracking Occluded Objects and Recovering Incomplete Trajectories by Reasoning About Containment Relations and Human Actions

摘要

This paper studies a challenging problem of tracking severely occluded objects in long video sequences. The proposed method reasons about the containment relations and human actions, thus infers and recovers occluded objects identities while contained or blocked by others. There are two conditions that lead to incomplete trajectories: i) Contained. The occlusion is caused by a containment relation formed between two objects, e.g., an unobserved laptop inside a backpack forms containment relation between the laptop and the backpack. ii) Blocked. The occlusion is caused by other objects blocking the view from certain locations, during which the containment relation does not change. By explicitly distinguishing these two causes of occlusions, the proposed algorithm formulates tracking problem as a network flow representation encoding containment relations and their changes. By assuming all the occlusions are not spontaneously happened but only triggered by human actions, an MAP inference is applied to jointly interpret the trajectory of an object by detection in space and human actions in time. To quantitatively evaluate our algorithm, we collect a new occluded object dataset captured by Kinect sensor, including a set of RGB-D videos and human skeletons with multiple actors, various objects, and different changes of containment relations. In the experiments, we show that the proposed method demonstrates better performance on tracking occluded objects compared with baseline methods.

关键词

spatial-temporal reasoning; containment; multi-object tracking

全文

PDF


Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction

摘要

Conventional methods of 3D object generative modeling learn volumetric predictions using deep networks with 3D convolutional operations, which are direct analogies to classical 2D ones. However, these methods are computationally wasteful in attempt to predict 3D shapes, where information is rich only on the surfaces. In this paper, we propose a novel 3D generative modeling framework to efficiently generate object shapes in the form of dense point clouds. We use 2D convolutional operations to predict the 3D structure from multiple viewpoints and jointly apply geometric reasoning with 2D projection optimization. We introduce the pseudo-renderer, a differentiable module to approximate the true rendering operation, to synthesize novel depth maps for optimization. Experimental results for single-image 3D object reconstruction tasks show that we outperforms state-of-the-art methods in terms of shape similarity and prediction density.

关键词

3D reconstruction; point cloud; generative modeling;

全文

PDF


Multi-Scale Face Restoration With Sequential Gating Ensemble Network

摘要

Restoring face images from distortions is important in face recognition applications and is challenged by multiple scale issues, which is still not well-solved in research area. In this paper, we present a Sequential Gating Ensemble Network (SGEN) for multi-scale face restoration issue. We first employ the principle of ensemble learning into SGEN architecture design to reinforce predictive performance of the network. The SGEN aggregates multi-level base-encoders and base-decoders into the network, which enables the network to contain multiple scales of receptive field. Instead of combining these base-en/decoders directly with non-sequential operations, the SGEN takes base-en/decoders from different levels as sequential data. Specifically, the SGEN learns to sequentially extract high level information from base-encoders in bottom-up manner and restore low level information from base-decoders in top-down manner. Besides, we propose to realize bottom-up and top-down information combination and selection with Sequential Gating Unit (SGU). The SGU sequentially takes two inputs from different levels and decides the output based on one active input. Experiment results demonstrate that our SGEN is more effective at multi-scale human face restoration with more image details and less noise than state-of-the-art image restoration models. By using adversarial training, SGEN also produces more visually preferred results than other models through subjective evaluation.

关键词

全文

PDF


Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion

摘要

Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep-based framework for action recognition, which improves the recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network which extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation for input actions. We further introduce an asynchronous fusion network. It fuses information from different streams by asynchronously integrating stream-wise features at different time points, hence better leveraging the complementary information in different streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves the state-of-the-art performance.

关键词

Action Recognition; Action Granularity; Asychronuous Fusion

全文

PDF


T-C3D: Temporal Convolutional 3D Network for Real-Time Action Recognition

摘要

Video-based action recognition with deep neural networks has shown remarkable progress. However, most of the existing approaches are too computationally expensive due to the complex network architecture. To address these problems, we propose a new real-time action recognition architecture, called Temporal Convolutional 3D Network (T-C3D), which learns video action representations in a hierarchical multi-granularity manner. Specifically, we combine a residual 3D convolutional neural network which captures complementary information on the appearance of a single frame and the motion between consecutive frames with a new temporal encoding method to explore the temporal dynamics of the whole video. Thus heavy calculations are avoided when doing the inference, which enables the method to be capable of real-time processing. On two challenging benchmark datasets, UCF101 and HMDB51, our method is significantly better than state-of-the-art real-time methods by over 5.4% in terms of accuracy and 2 times faster in terms of inference speed (969 frames per second), demonstrating comparable recognition performance to the state-of-the-art methods. The source code for the complete system as well as the pre-trained models are publicly available at https://github.com/tc3d.

关键词

全文

PDF


Cross-Domain Human Parsing via Adversarial Feature and Label Adaptation

摘要

Human parsing has been extensively studied recently due to its wide applications in many important scenarios. Mainstream fashion parsing models (i.e., parsers) focus on parsing the high-resolution and clean images. However, directly applying the parsers trained on benchmarks of high-quality samples to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives non-satisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking the benchmark dataset with extensive pixel-wise labeling as the source domain, how to obtain a satisfactory parser on a new target domain without requiring any additional manual labeling? To this end, we propose a novel and efficient cross-domain human parsing model to bridge the cross-domain differences in terms of visual appearance and environment conditions and fully exploit commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation to effectively reduces the discrepancy between feature distributions of two domains. Besides, our proposed model also introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted where LIP dataset is the source domain and 4 different datasets including surveillance videos, movies and runway shows without any annotations, are evaluated as target domains. The results consistently confirm data efficiency and performance advantages of the proposed method for the challenging cross-domain human parsing problem.

关键词

semantic segmentation; cross domain

全文

PDF


Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition

摘要

In this paper, we present a Character-Aware Neural Network (Char-Net) for recognizing distorted scene text. Our Char-Net is composed of a word-level encoder, a character-level encoder, and a LSTM-based decoder. Unlike previous work which employed a global spatial transformer network to rectify the entire distorted text image, we take an approach of detecting and rectifying individual characters. To this end, we introduce a novel hierarchical attention mechanism (HAM) which consists of a recurrent RoIWarp layer and a character-level attention layer. The recurrent RoIWarp layer sequentially extracts a feature region corresponding to a character from the feature map produced by the word-level encoder, and feeds it to the character-level encoder which removes the distortion of the character through a simple spatial transformer and further encodes the character region. The character-level attention layer then attends to the most relevant features of the feature map produced by the character-level encoder and composes a context vector, which is finally fed to the LSTM-based decoder for decoding. This approach of adopting a simple local transformation to model the distortion of individual characters not only results in an improved efficiency, but can also handle different types of distortion that are hard, if not impossible, to be modelled by a single global transformation. Experiments have been conducted on six public benchmark datasets. Our results show that Char-Net can achieve state-of-the-art performance on all the benchmarks, especially on the IC-IST which contains scene text with large distortion. Code will be made available.

关键词

Text Recognition; Attention Mechanism; RNN

全文

PDF


Semi-Supervised Bayesian Attribute Learning for Person Re-Identification

摘要

Person re-identification (re-ID) tasks aim to identify the same person in multiple images captured from non-overlapping camera views. Most previous re-ID studies have attempted to solve this problem through either representation learning or metric learning, or by combining both techniques. Representation learning relies on the latent factors or attributes of the data. In most of these works, the dimensionality of the factors/attributes has to be manually determined for each new dataset. Thus, this approach is not robust. Metric learning optimizes a metric across the dataset to measure similarity according to distance. However, choosing the optimal method for computing these distances is data dependent, and learning the appropriate metric relies on a sufficient number of pair-wise labels. To overcome these limitations, we propose a novel algorithm for person re-ID, called semi-supervised Bayesian attribute learning. We introduce an Indian Buffet Process to identify the priors of the latent attributes. The dimensionality of attributes factors is then automatically determined by nonparametric Bayesian learning. Meanwhile, unlike traditional distance metric learning, we propose a re-identification probability distribution to describe how likely it is that a pair of images contains the same person. This technique relies solely on the latent attributes of both images. Moreover, pair-wise labels that are not known can be estimated from pair-wise labels that are known, making this a robust approach for semi-supervised learning. Extensive experiments demonstrate the superior performance of our algorithm over several state-of-the-art algorithms on small-scale datasets and comparable performance on large-scale re-ID datasets.

关键词

全文

PDF


A Cascaded Inception of Inception Network With Attention Modulated Feature Fusion for Human Pose Estimation

摘要

Accurate keypoint localization of human pose needs diversified features: the high level for contextual dependencies and the low level for detailed refinement of joints. However, the importance of the two factors varies from case to case, but how to efficiently use the features is still an open problem. Existing methods have limitations in preserving low level features, adaptively adjusting the importance of different levels of features, and modeling the human perception process. This paper presents three novel techniques step by step to efficiently utilize different levels of features for human pose estimation. Firstly, an inception of inception (IOI) block is designed to emphasize the low level features. Secondly, an attention mechanism is proposed to adjust the importance of individual levels according to the context. Thirdly, a cascaded network is proposed to sequentially localize the joints to enforce message passing from joints of stand-alone parts like head and torso to remote joints like wrist or ankle. Experimental results demonstrate that the proposed method achieves the state-of-the-art performance on both MPII and LSP benchmarks.

关键词

Inception of Inception; Cascade Joint Network

全文

PDF


Dictionary Learning Inspired Deep Network for Scene Recognition

摘要

Scene recognition remains one of the most challenging problems in image understanding. With the help of fully connected layers (FCL) and rectified linear units (ReLu), deep networks can extract the moderately sparse and discriminative feature representation required for scene recognition. However, few methods consider exploiting a sparsity model for learning the feature representation in order to provide enhanced discriminative capability. In this paper, we replace the conventional FCL and ReLu with a new dictionary learning layer, that is composed of a finite number of recurrent units to simultaneously enhance the sparse representation and discriminative abilities of features via the determination of optimal dictionaries. In addition, with the help of the structure of the dictionary, we propose a new label discriminative regressor to boost the discrimination ability. We also propose new constraints to prevent overfitting by incorporating the advantage of the Mahalanobis and Euclidean distances to balance the recognition accuracy and generalization performance. Our proposed approach is evaluated using various scene datasets and shows superior performance to many state-of-the-art approaches.

关键词

全文

PDF


PoseHD: Boosting Human Detectors Using Human Pose Information

摘要

As most recently proposed methods for human detection have achieved a sufficiently high recall rate within a reasonable number of proposals, in this paper, we mainly focus on how to improve the precision rate of human detectors. In order to address the two main challenges in precision improvement, i.e., i) hard background instances and ii) redundant partial proposals, we propose the novel PoseHD framework, a top-down pose-based approach on the basis of an arbitrary state-of-the-art human detector. In our proposed PoseHD framework, we first make use of human pose estimation (in a batch manner) and present pose heatmap classification (by a convolutional neural network) to eliminate hard negatives by extracting the more detailed structural information; then, we utilize pose-based proposal clustering and reranking modules, filtering redundant partial proposals by comprehensively considering both holistic and part information. The experimental results on multiple pedestrian benchmark datasets validate that our proposed PoseHD framework can generally improve the overall performance of recent state-of-the-art human detectors (by 2-4% in both mAP and MR metrics). Moreover, our PoseHD framework can be easily extended to object detection with large-scale object part annotations. Finally, in this paper, we present extensive ablative analysis to compare our approach with these traditional bottom-up pose-based models and highlight the importance of our framework design decisions.

关键词

全文

PDF


SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network

摘要

A new approach for real-time scene text recognition is proposed in this paper. A novel binary convolutional encoder-decoder network (B-CEDNet) together with a bidirectional recurrent neural network (Bi-RNN). The B-CEDNet is engaged as a visual front-end to provide elaborated character detection, and a back-end Bi-RNN performs character-level sequential correction and classification based on learned contextual knowledge. The front-end B-CEDNet can process multiple regions containing characters using a one-off forward operation, and is trained under binary constraints with significant compression. Hence it leads to both remarkable inference run-time speedup as well as memory usage reduction. With the elaborated character detection, the back-end Bi-RNN merely processes a low dimension feature sequence with category and spatial information of extracted characters for sequence correction and classification. By training with over 1,000,000 synthetic scene text images, the B-CEDNet achieves a recall rate of 0.86, precision of 0.88 and F-score of 0.87 on ICDAR-03 and ICDAR-13. With the correction and classification by Bi-RNN, the proposed real-time scene text recognition achieves state-of-the-art accuracy while only consumes less than 1-ms inference run-time. The flow processing flow is realized on GPU with a small network size of 1.01 MB for B-CEDNet and 3.23 MB for Bi-RNN, which is much faster and smaller than the existing solutions.

关键词

Binary Neural Network, Text Recognition

全文

PDF


Multimodal Keyless Attention Fusion for Video Classification

摘要

The problem of video classification is inherently sequential and multimodal, and deep neural models hence need to capture and aggregate the most pertinent signals for a given input video. We propose Keyless Attention as an elegant and efficient means to more effectively account for the sequential nature of the data. Moreover, comparing a variety of multimodal fusion methods, we find that Multimodal Keyless Attention Fusion is the most successful at discerning interactions between modalities. We experiment on four highly heterogeneous datasets, UCF101, ActivityNet, Kinetics, and YouTube-8M to validate our conclusion, and show that our approach achieves highly competitive results. Especially on large-scale data, our method has great advantages in efficiency and performance. Most remarkably, our best single model can achieve 77.0% in terms of the top-1 accuracy and 93.2% in terms of the top-5 accuracy on the Kinetics validation set, and achieve 82.2% in terms of GAP@20 on the official YouTube-8M test set.

关键词

Video Classification; Attention Mechanism

全文

PDF


Towards Affordable Semantic Searching: Zero-Shot Retrieval via Dominant Attributes

摘要

Instance-level retrieval has become an essential paradigm to index and retrieves images from large-scale databases. Conventional instance search requires at least an example of the query image to retrieve images that contain the same object instance. Existing semantic retrieval can only search semantically-related images, such as those sharing the same category or a set of tags, not the exact instances. Meanwhile, the unrealistic assumption is that all categories or tags are known beforehand. Training models for these semantic concepts highly rely on instance-level attributes or human captions which are expensive to acquire. Given the above challenges, this paper studies the Zero-shot Retrieval problem that aims for instance-level image search using only a few dominant attributes. The contributions are: 1) we utilise automatic word embedding to infer class-level attributes to circumvent expensive human labelling; 2) the inferred class-attributes can be extended into discriminative instance attributes through our proposed Latent Instance Attributes Discovery (LIAD) algorithm; 3) our method is not restricted to complete attribute signatures, query of dominant attributes can also be dealt with. On two benchmarks, CUB and SUN, extensive experiments demonstrate that our method can achieve promising performance for the problem. Moreover, our approach can also benefit conventional ZSL tasks.

关键词

Zero-shot Learning; Image Retrieval; Dominate Attributes

全文

PDF


Co-Attending Free-Form Regions and Detections With Multi-Modal Multiplicative Feature Embedding for Visual Question Answering

摘要

Recently, the Visual Question Answering (VQA) task has gained increasing attention in artificial intelligence. Existing VQA methods mainly adopt the visual attention mechanism to associate the input question with corresponding image regions for effective question answering. The free-form region based and the detection-based visual attention mechanisms are mostly investigated, with the former ones attending free-form image regions and the latter ones attending pre-specified detection-box regions. We argue that the two attention mechanisms are able to provide complementary information and should be effectively integrated to better solve the VQA problem. In this paper, we propose a novel deep neural network for VQA that integrates both attention mechanisms. Our proposed framework effectively fuses features from free-form image regions, detection boxes, and question representations via a multi-modal multiplicative feature embedding scheme to jointly attend question-related free-form image regions and detection boxes for more accurate question answering. The proposed method is extensively evaluated on two publicly available datasets, COCO-QA and VQA, and outperforms state-of-the-art approaches. Source code is available at https://github.com/lupantech/dual-mfa-vqa.

关键词

Visual Question Answering; Multi-modal Embedding; Co-attention Mechanism; Object Detection

全文

PDF


Unsupervised Articulated Skeleton Extraction From Point Set Sequences Captured by a Single Depth Camera

摘要

How to robustly and accurately extract articulated skeletons from point set sequences captured by a single consumer-grade depth camera still remains to be an unresolved challenge to date. To address this issue, we propose a novel, unsupervised approach consisting of three contributions (steps): (i) a non-rigid point set registration algorithm to first build one-to-one point correspondences among the frames of a sequence; (ii) a skeletal structure extraction algorithm to generate a skeleton with reasonable numbers of joints and bones; (iii) a skeleton joints estimation algorithm to achieve accurate joints. At the end, our method can produce a quality articulated skeleton from a single 3D point sequence corrupted with noise and outliers. The experimental results show that our approach soundly outperforms state of the art techniques, in terms of both visual quality and accuracy.

关键词

unsupervised skeleton extraction; point set sequences; depth camera

全文

PDF


Curve-Structure Segmentation From Depth Maps: A CNN-Based Approach and Its Application to Exploring Cultural Heritage Objects

摘要

Motivated by the important archaeological application of exploring cultural heritage objects, in this paper we study the challenging problem of automatically segmenting curve structures that are very weakly stamped or carved on an object surface in the form of a highly noisy depth map. Different from most classical low-level image segmentation methods that are known to be very sensitive to the noise and occlusions, we propose a new supervised learning algorithm based on Convolutional Neural Network (CNN) to implicitly learn and utilize more curve geometry and pattern information for addressing this challenging problem. More specifically, we first propose a Fully Convolutional Network (FCN) to estimate the skeleton of curve structures and at each skeleton pixel, a scale value is estimated to reflect the local curve width. Then we propose a dense prediction network to refine the estimated curve skeletons. Based on the estimated scale values, we finally develop an adaptive thresholding algorithm to achieve the final segmentation of curve structures. In the experiment, we validate the performance of the proposed method on a dataset of depth images scanned from unearthed pottery shards dating to the Woodland period of Southeastern North America.

关键词

Cultural Heritage Object; Curve-Structure Segmentation; Convolutional Neural Network

全文

PDF


Multi-Channel Pyramid Person Matching Network for Person Re-Identification

摘要

In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of the semantic-components and the color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations. These correspondence representations are fused to perform the re-identification task. Further, the proposed framework is optimized via a unified end-to-end deep learning scheme. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art literature, especially on the rank-1 recognition rate.

关键词

全文

PDF


UnFlow: Unsupervised Learning of Optical Flow With a Bidirectional Census Loss

摘要

In the era of end-to-end deep learning, many advances in computer vision are driven by large amounts of labeled data. In the optical flow setting, however, obtaining dense per-pixel ground truth for real scenes is difficult and thus such data is rare. Therefore, recent end-to-end convolutional networks for optical flow rely on synthetic datasets for supervision, but the domain mismatch between training and test scenarios continues to be a challenge. Inspired by classical energy-based optical flow methods, we design an unsupervised loss based on occlusion-aware bidirectional flow estimation and the robust census transform to circumvent the need for ground truth flow. On the KITTI benchmarks, our unsupervised approach outperforms previous unsupervised deep networks by a large margin, and is even more accurate than similar supervised methods trained on synthetic datasets alone. By optionally fine-tuning on the KITTI training data, our method achieves competitive optical flow accuracy on the KITTI 2012 and 2015 benchmarks, thus in addition enabling generic pre-training of supervised networks for datasets with limited amounts of ground truth.

关键词

Motion; Deep Learning/Neural Networks; Unsupervised Learning

全文

PDF


Weakly Supervised Collective Feature Learning From Curated Media

摘要

The current state-of-the-art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community contributed text tags as weak labels, however, the concepts underlying a single text tag strongly depends on the users. We instead present a new paradigm for learning discriminative features by making full use of the human curation process on social networking services (SNSs). During the process of content curation, SNS users collect content items manually from various sources and group them by context, all for their own benefit. Due to the nature of this process, we can assume that (1) content items in the same group share the same semantic concept and (2) groups sharing the same images might have related semantic concepts. Through these insights, we can define human curated groups as weak labels from which our proposed framework can learn discriminative features as a representation in the space of semantic concepts the users intended when creating the groups. We show that this feature learning can be formulated as a problem of link prediction for a bipartite graph whose nodes corresponds to content items and human curated groups, and propose a novel method for feature learning based on sparse coding or network fine-tuning.

关键词

weakly supervised learning; feature learning; content curation; social media; link prediction

全文

PDF


Asking Friendly Strangers: Non-Semantic Attribute Transfer

摘要

Attributes can be used to recognize unseen objects from a textual description. Their learning is oftentimes accomplished with a large amount of annotations, e.g. around 160k-180k, but what happens if for a given attribute, we do not have many annotations? The standard approach would be to perform transfer learning, where we use source models trained on other attributes, to learn a separate target attribute. However existing approaches only consider transfer from attributes in the same domain i.e. they perform semantic transfer between attributes that have related meaning. Instead, we propose to perform non-semantic transfer from attributes that may be in different domains, hence they have no semantic relation to the target attributes. We develop an attention-guided transfer architecture that learns how to weigh the available source attribute classifiers, and applies them to image features for the attribute name of interest, to make predictions for that attribute. We validate our approach on 272 attributes from five domains: animals, objects, scenes, shoes and textures. We show that semantically unrelated attributes provide knowledge that helps improve the accuracy of the target attribute of interest, more so than only allowing transfer from semantically related attributes.

关键词

deep learning; attribute learning; transfer learning; attention

全文

PDF


Spatial as Deep: Spatial CNN for Traffic Scene Understanding

摘要

Convolutional neural networks (CNNs) are usually built by stacking convolutional operations layer-by-layer. Although CNN has shown strong capability to extract semantics from raw pixels, its capacity to capture spatial relationships of pixels across rows and columns of an image is not fully explored. These relationships are important to learn semantic objects with strong shape priors but weak appearance coherences, such as traffic lanes, which are often occluded or not even painted on the road surface as shown in Fig. 1 (a). In this paper, we propose Spatial CNN (SCNN), which generalizes traditional deep layer-by-layer convolutions to slice-by-slice convolutions within feature maps, thus enabling message passings between pixels across rows and columns in a layer. Such SCNN is particular suitable for long continuous shape structure or large objects, with strong spatial relationship but less appearance clues, such as traffic lanes, poles, and wall. We apply SCNN on a newly released very challenging traffic lane detection dataset and Cityscapse dataset. The results show that SCNN could learn the spatial relationship for structure output and significantly improves the performance. We show that SCNN outperforms the recurrent neural network (RNN) based ReNet and MRF+CNN (MRFNet) in the lane detection dataset by 8.7% and 4.6% respectively. Moreover, our SCNN won the 1st place on the TuSimple Benchmark Lane Detection Challenge, with an accuracy of 96.53%.

关键词

Scene Understanding; Lane Detection; Spatial CNN

全文

PDF


Adaptive Feature Abstraction for Translating Video to Text

摘要

Previous models for video captioning often use the output from a specific layer of a Convolutional Neural Network (CNN) as video features. However, the variable context-dependent semantics in the video may make it more appropriate to adaptively select features from the multiple CNN layers. We propose a new approach for generating adaptive spatiotemporal representations of videos for the captioning task. A novel attention mechanism is developed, that adaptively and sequentially focuses on different layers of CNN features (levels of feature "abstraction"), as well as local spatiotemporal regions of the feature maps at each layer. The proposed approach is evaluated on three benchmark datasets: YouTube2Text, M-VAD and MSR-VTT. Along with visualizing the results and how the model works, these experiments quantitatively demonstrate the effectiveness of the proposed adaptive spatiotemporal feature abstraction for translating videos to sentences with rich semantics.

关键词

video captioning; attention; adaptive feature representation

全文

PDF


Scene-Centric Joint Parsing of Cross-View Videos

摘要

Cross-view video understanding is an important yet under-explored area in computer vision. In this paper, we introduce a joint parsing framework that integrates view-centric proposals into scene-centric parse graphs that represent a coherent scene-centric understanding of cross-view scenes. Our key observations are that overlapping fields of views embed rich appearance and geometry correlations and that knowledge fragments corresponding to individual vision tasks are governed by consistency constraints available in commonsense knowledge. The proposed joint parsing framework represents such correlations and constraints explicitly and generates semantic scene-centric parse graphs. Quantitative experiments show that scene-centric predictions in the parse graph outperform view-centric predictions.

关键词

Joint Parsing; Scene-centric Representation; Knowledge Fusion; Spatio-temporal Semantic Parse Graph; Ontology Graph

全文

PDF


Exploring Human-Like Attention Supervision in Visual Question Answering

摘要

Attention mechanisms have been widely applied in the Visual Question Answering (VQA) task, as they help to focus on the area-of-interest of both visual and textual information. To answer the questions correctly, the model needs to selectively target different areas of an image, which suggests that an attention-based model may benefit from an explicit attention supervision. In this work, we aim to address the problem of adding attention supervision to VQA models. Since there is a lack of human attention data, we first propose a Human Attention Network (HAN) to generate human-like attention maps, training on a recently released dataset called Human ATtention Dataset (VQA-HAT). Then, we apply the pre-trained HAN on the VQA v2.0 dataset to automatically produce the human-like attention maps for all image-question pairs. The generated human-like attention map dataset for the VQA v2.0 dataset is named as Human-Like ATtention (HLAT) dataset. Finally, we apply human-like attention supervision to an attention-based VQA model. The experiments show that adding human-like supervision yields a more accurate attention together with a better performance, showing a promising future for human-like attention supervision in VQA.

关键词

Computer Vision; Visual Question Answering

全文

PDF


RAN4IQA: Restorative Adversarial Nets for No-Reference Image Quality Assessment

摘要

Inspired by the free-energy brain theory, which implies that human visual system (HVS) tends to reduce uncertainty and restore perceptual details upon seeing a distorted image, we propose restorative adversarial net (RAN), a GAN-based model for no-reference image quality assessment (NR-IQA). RAN, which mimics the process of HVS, consists of three components: a restorator, a discriminator and an evaluator. The restorator restores and reconstructs input distorted image patches, while the discriminator distinguishes the reconstructed patches from the pristine distortion-free patches. After restoration, we observe that the perceptual distance between the restored and the distorted patches is monotonic with respect to the distortion level. We further define Gain of Restoration (GoR) based on this phenomenon. The evaluator predicts perceptual score by extracting feature representations from the distorted and restored patches to measure GoR. Eventually, the quality score of an input image is estimated by weighted sum of the patch scores. Experimental results on Waterloo Exploration, LIVE and TID2013 show the effectiveness and generalization ability of RAN compared to the state-of-the-art NR-IQA models.

关键词

Image Quality Assessment; Generative Adversarial Nets; Human Visual System

全文

PDF


Extreme Low Resolution Activity Recognition With Multi-Siamese Embedding Learning

摘要

This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16x12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but also is crucial for enabling privacy-preserving recognition of human activities. We design a new two-stream multi-Siamese convolutional neural network. The idea is to explicitly capture the inherent property of low resolution (LR) videos that two images originated from the exact same scene often have totally different pixel values depending on their LR transformations. Our approach learns the shared embedding space that maps LR videos with the same content to the same location regardless of their transformations. We experimentally confirm that our approach of jointly learning such transform robust LR video representation and the classifier outperforms the previous state-of-the-art low resolution recognition approaches on two public standard datasets by a meaningful margin.

关键词

human activity recognition; extreme low resolution videos; privacy-preserving recognition

全文

PDF


Top-Down Feedback for Crowd Counting Convolutional Neural Network

摘要

Counting people in dense crowds is a demanding task even for humans. This is primarily due to the large variability in appearance of people. Often people are only seen as a bunch of blobs. Occlusions, pose variations and background clutter further compound the difficulty. In this scenario, identifying a person requires larger spatial context and semantics of the scene. But the current state-of-the-art CNN regressors for crowd counting are feedforward and use only limited spatial context to detect people. They look for local crowd patterns to regress the crowd density map, resulting in false predictions. Hence, we propose top-down feedback to correct the initial prediction of the CNN. Our architecture consists of a bottom-up CNN along with a separate top-down CNN to generate feedback. The bottom-up network, which regresses the crowd density map, has two columns of CNN with different receptive fields. Features from various layers of the bottom-up CNN are fed to the top-down network. The feedback, thus generated, is applied on the lower layers of the bottom-up network in the form of multiplicative gating. This masking weighs activations of the bottom-up network at spatial as well as feature levels to correct the density prediction. We evaluate the performance of our model on all major crowd datasets and show the effectiveness of top-down feedback.

关键词

CNN; Crowd Counting; Scene Understanding

全文

PDF


Game of Sketches: Deep Recurrent Models of Pictionary-Style Word Guessing

摘要

The ability of machine-based agents to play games in human-like fashion is considered a benchmark of progress in AI. In this paper, we introduce the first computational model aimed at Pictionary, the popular word-guessing social game. We first introduce Sketch-QA, an elementary version of Visual Question Answering task. Styled after Pictionary, Sketch-QA uses incrementally accumulated sketch stroke sequences as visual data. Notably, Sketch-QA involves asking a fixed question ("What object is being drawn?") and gathering open-ended guess-words from human guessers. To mimic Pictionary-style guessing, we propose a deep neural model which generates guess-words in response to temporally evolving human-drawn sketches. Our model even makes human-like mistakes while guessing, thus amplifying the human mimicry factor. We evaluate our model on the large-scale guess-word dataset generated via Sketch-QA task and compare with various baselines. We also conduct a Visual Turing Test to obtain human impressions of the guess-words generated by humans and our model. Experimental results demonstrate the promise of our approach for Pictionary and similarly themed games.

关键词

Pictionary; Sketches; AI; Deep Learning

全文

PDF


DLPaper2Code: Auto-Generation of Code From Deep Learning Research Papers

摘要

With an abundance of research papers in deep learning, reproducibility or adoption of the existing works becomes a challenge. This is due to the lack of open source implementations provided by the authors. Even if the source code is available, then re-implementing research papers in a different library is a daunting task. To address these challenges, we propose a novel extensible approach, DLPaper2Code, to extract and understand deep learning design flow diagrams and tables available in a research paper and convert them to an abstract computational graph. The extracted computational graph is then converted into execution ready source code in both Keras and Caffe, in real-time. An arXiv-like website is created where the automatically generated designs is made publicly available for 5,000 research papers. The generated designs could be rated and edited using an intuitive drag-and-drop UI framework in a crowd sourced manner. To evaluate our approach, we create a simulated dataset with over 216,000 valid deep learning design flow diagrams using a manually defined grammar. Experiments on the simulated dataset show that the proposed framework provide more than 93% accuracy in flow diagram content extraction.

关键词

Deep Learning; Research Paper Mining; DARVIZ

全文

PDF


Region-Based Quality Estimation Network for Large-Scale Person Re-Identification

摘要

One of the major restrictions on the performance of video-based person re-id is partial noise caused by occlusion, blur and illumination. Since different spatial regions of a single frame have various quality, and the quality of the same region also varies across frames in a tracklet, a good way to address the problem is to effectively aggregate complementary information from all frames in a sequence, using better regions from other frames to compensate the influence of an image region with poor quality. To achieve this, we propose a novel Region-based Quality Estimation Network (RQEN), in which an ingenious training mechanism enables the effective learning to extract the complementary region-based information between different frames. Compared with other feature extraction methods, we achieved comparable results of 92.4%, 76.1% and 77.83% on the PRID 2011, iLIDS-VID and MARS, respectively. In addition, to alleviate the lack of clean large-scale person re-id datasets for the community, this paper also contributes a new high-quality dataset, named "Labeled Pedestrian in the Wild (LPW)" which contains 7,694 tracklets with over 590,000 images. Despite its relatively large scale, the annotations also possess high cleanliness. Moreover, it's more challenging in the following aspects: the age of characters varies from childhood to elderhood; the postures of people are diverse, including running and cycling in addition to the normal walking state.

关键词

全文

PDF


Adversarial Discriminative Heterogeneous Face Recognition

摘要

The gap between sensing patterns of different face modalities remains a challenging problem in heterogeneous face recognition (HFR). This paper proposes an adversarial discriminative feature learning framework to close the sensing gap via adversarial learning on both raw-pixel space and compact feature space. This framework integrates cross-spectral face hallucination and discriminative feature learning into an end-to-end adversarial network. In the pixel space, we make use of generative adversarial networks to perform cross-spectral face hallucination. An elaborate two-path model is introduced to alleviate the lack of paired images, which gives consideration to both global structures and local textures. In the feature space, an adversarial loss and a high-order variance discrepancy loss are employed to measure the global and local discrepancy between two heterogeneous distributions respectively. These two losses enhance domain-invariant feature learning and modality independent noise removing. Experimental results on three NIR-VIS databases show that our proposed approach outperforms state-of-the-art HFR methods, without requiring of complex network or large-scale training dataset.

关键词

全文

PDF


Learning Binary Residual Representations for Domain-Specific Video Streaming

摘要

We study domain-specific video streaming. Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission. Several popular video streaming services, such as the video game streaming services of GeForce Now and Twitch, fall in this category. While conventional video compression standards such as H.264 are commonly used for this task, we hypothesize that one can leverage the property that the videos are all in the same domain to achieve better video quality. Based on this hypothesis, we propose a novel video compression pipeline. Specifically, we first apply H.264 to compress domain-specific videos. We then train a novel binary autoencoder to encode the leftover domain-specific residual information frame-by-frame into binary representations. These binary representations are then compressed and sent to the client together with the H.264 stream. In our experiments, we show that our pipeline yields consistent gains over standard H.264 compression across several benchmark datasets while using the same channel bandwidth.

关键词

Video streaming; Binary representation; Residual autoencoder

全文

PDF


Diverse Beam Search for Improved Description of Complex Scenes

摘要

A single image captures the appearance and position of multiple entities in a scene as well as their complex interactions. As a consequence, natural language grounded in visual contexts tends to be diverse---with utterances differing as focus shifts to specific objects, interactions, or levels of detail. Recently, neural sequence models such as RNNs and LSTMs have been employed to produce visually-grounded language. Beam Search, the standard work-horse for decoding sequences from these models, is an approximate inference algorithm that decodes the top-B sequences in a greedy left-to-right fashion. In practice, the resulting sequences are often minor rewordings of a common utterance, failing to capture the multimodal nature of source images. To address this shortcoming, we propose Diverse Beam Search (DBS), a diversity promoting alternative to BS for approximate inference. DBS produces sequences that are significantly different from each other by incorporating diversity constraints within groups of candidate sequences during decoding; moreover, it achieves this with minimal computational or memory overhead. We demonstrate that our method improves both diversity and quality of decoded sequences over existing techniques on two visually-grounded language generation tasks---image captioning and visual question generation---particularly on complex scenes containing diverse visual content. We also show similar improvements at language-only machine translation tasks, highlighting the generality of our approach.

关键词

Recurrent Neural Networks, Beam Search, Diversity

全文

PDF


Movie Question Answering: Remembering the Textual Cues for Layered Visual Contents

摘要

Movies provide us with a mass of visual content as well as attracting stories. Existing methods have illustrated that understanding movie stories through only visual content is still a hard problem. In this paper, for answering questions about movies, we put forward a Layered Memory Network (LMN) that represents frame-level and clip-level movie content by the Static Word Memory module and the Dynamic Subtitle Memory module, respectively. Particularly, we firstly extract words and sentences from the training movie subtitles. Then the hierarchically formed movie representations, which are learned from LMN, not only encode the correspondence between words and visual content inside frames, but also encode the temporal alignment between sentences and frames inside movie clips. We also extend our LMN model into three variant frameworks to illustrate the good extendable capabilities. We conduct extensive experiments on the MovieQA dataset. With only visual content as inputs, LMN with frame-level representation obtains a large performance improvement. When incorporating subtitles into LMN to form the clip-level representation, we achieve the state-of-the-art performance on the online evaluation task of 'Video+Subtitles'. The good performance successfully demonstrates that the proposed framework of LMN is effective and the hierarchically formed movie representations have good potential for the applications of movie question answering.

关键词

movie question answering, layered memory network

全文

PDF


Supervised Deep Hashing for Hierarchical Labeled Data

摘要

Recently, hashing methods have been widely used in large-scale image retrieval. However, most existing supervised hashing methods do not consider the hierarchical relation of labels,which means that they ignored the rich semantic information stored in the hierarchy. Moreover, most of previous works treat each bit in a hash code equally, which does not meet the scenario of hierarchical labeled data. To tackle the aforementioned problems, in this paper, we propose a novel deep hashing method, called supervised hierarchical deep hashing (SHDH), to perform hash code learning for hierarchical labeled data. Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each level, and design a deep neural network to obtain a hash code for each data point. Extensive experiments on two real-world public datasets show that the proposed method outperforms the state-of-the-art baselines in the image retrieval task.

关键词

Deep Hashing; Hierarchical Labeled Data; Supervised Learning

全文

PDF


Show, Reward and Tell: Automatic Generation of Narrative Paragraph From Photo Stream by Adversarial Training

摘要

Impressive image captioning results (i.e., an objective description for an image) are achieved with plenty of training pairs. In this paper, we take one step further to investigate the creation of narrative paragraph for a photo stream. This task is even more challenging due to the difficulty in modeling an ordered photo sequence and in generating a relevant paragraph with expressive language style for storytelling. The difficulty can even be exacerbated by the limited training data, so that existing approaches almost focus on search-based solutions. To deal with these challenges, we propose a sequence-to-sequence modeling approach with reinforcement learning and adversarial training. First, to model the ordered photo stream, we propose a hierarchical recurrent neural network as story generator, which is optimized by reinforcement learning with rewards. Second, to generate relevant and story-style paragraphs, we design the rewards with two critic networks, including a multi-modal and a language-style discriminator. Third, we further consider the story generator and reward critics as adversaries. The generator aims to create indistinguishable paragraphs to human-level stories, whereas the critics aim at distinguishing them and further improving the generator by policy gradient. Experiments on three widely-used datasets show the effectiveness, against state-of-the-art methods with relative increase of 20.2% by METEOR. We also show the subjective preference for the proposed approach over the baselines through a user study with 30 human subjects.

关键词

storytelling; reinforcement learning; adversarial training

全文

PDF


Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition

摘要

A novel deep neural network training paradigm that exploits the conjoint information in multiple heterogeneous sources is proposed. Specifically, in a RGB-D based action recognition task, it cooperatively trains a single convolutional neural network (named c-ConvNet) on both RGB visual features and depth features, and deeply aggregates the two kinds of features for action recognition. Differently from the conventional ConvNet that learns the deep separable features for homogeneous modality-based classification with only one softmax loss function, the c-ConvNet enhances the discriminative power of the deeply learned features and weakens the undesired modality discrepancy by jointly optimizing a ranking loss and a softmax loss for both homogeneous and heterogeneous modalities. The ranking loss consists of intra-modality and cross-modality triplet losses, and it reduces both the intra-modality and cross-modality feature variations. Furthermore, the correlations between RGB and depth data are embedded in the c-ConvNet, and can be retrieved by either of the modalities and contribute to the recognition in the case even only one of the modalities is available. The proposed method was extensively evaluated on two large RGB-D action recognition datasets, ChaLearn LAP IsoGD and NTU RGB+D datasets, and one small dataset, SYSU 3D HOI, and achieved state-of-the-art results.

关键词

Action Recognition; RGB-D Data; Deep Learning

全文

PDF


Temporal-Enhanced Convolutional Network for Person Re-Identification

摘要

We propose a new neural network called Temporal-enhanced Convolutional Network (T-CN) for video-based person reidentification. For each video sequence of a person, a spatial convolutional subnet is first applied to each frame for representing appearance information, and then a temporal convolutional subnet links small ranges of continuous frames to extract local motion information. Such spatial and temporal convolutions together construct our T-CN based representation. Finally, a recurrent network is utilized to further explore global dynamics, followed by temporal pooling to generate an overall feature vector for the whole sequence. In the training stage, a Siamese network architecture is adopted to jointly optimize all the components with losses covering both identification and verification. In the testing stage, our network generates an overall discriminative feature representation for each input video sequence (whose length may vary a lot) in a feed-forward way, and even a simple Euclidean distance based matching can generate good re-identification results. Experiments on the most widely used benchmark datasets demonstrate the superiority of our proposal, in comparison with the state-of-the-art.

关键词

Deep Neural Networks; Person Re-Identification

全文

PDF


Transferable Semi-Supervised Semantic Segmentation

摘要

The performance of deep learning based semantic segmentation models heavily depends on sufficient data with careful annotations. However, even the largest public datasets only provide samples with pixel-level annotations for rather limited semantic categories. Such data scarcity critically limits scalability and applicability of semantic segmentation models in real applications. In this paper, we propose a novel transferable semi-supervised semantic segmentation model that can transfer the learned segmentation knowledge from a few strong categories with pixel-level annotations to unseen weak categories with only image-level annotations, significantly broadening the applicable territory of deep segmentation models. In particular, the proposed model consists of two complementary and learnable components: a Label transfer Network (L-Net) and a Prediction transfer Network (P-Net). The L-Net learns to transfer the segmentation knowledge from strong categories to the images in the weak categories and produces coarse pixel-level semantic maps, by effectively exploiting the similar appearance shared across categories. Meanwhile, the P-Net tailors the transferred knowledge through a carefully designed adversarial learning strategy and produces refined segmentation results with better details. Integrating the L-Net and P-Net achieves 96.5% and 89.4% performance of the fully-supervised baseline using 50% and 0% categories with pixel-level annotations respectively on PASCAL VOC 2012. With such a novel transfer mechanism, our proposed model is easily generalizable to a variety of new categories, only requiring image-level annotations, and offers appealing scalability in real applications.

关键词

Semantic Segmentation; Semi-supervised learning; Neural Network

全文

PDF


Emphasizing 3D Properties in Recurrent Multi-View Aggregation for 3D Shape Retrieval

摘要

Multi-view based shape descriptors have achieved impressive performance for 3D shape retrieval. The core of view-based methods is to interpret 3D structures through 2D observations. However, most existing methods pay more attention to discriminative models and none of them necessarily incorporate the 3D properties of the objects. To resolve this problem, we propose an encoder-decoder recurrent feature aggregation network (ERFA-Net) to emphasize the 3D properties of 3D shapes in multi-view features aggregation. In our network, a view sequence of the shape is trained to encode a discriminative shape embedding and estimate unseen rendered views of any viewpoints. This generation task gives an effective supervision which makes the network exploit 3D properties of shapes through various 2D images. During feature aggregation, a discriminative feature representation across multiple views is effectively exploited based on LSTM network. The proposed 3D representation has following advantages against other state-of-the-art: 1) it performs robust discrimination under the existence of noise such as view missing and occlusion, because of the improvement brought by 3D properties. 2) it has strong generative capabilities, which is useful for various 3D shape tasks. We evaluate ERFA-Net on two popular 3D shape datasets, ModelNet and ShapeNetCore55, and ERFA-Net outperforms the state-of-the-art methods significantly. Extensive experiments show the effectiveness and robustness of the proposed 3D representation.

关键词

全文

PDF


Unsupervised Part-Based Weighting Aggregation of Deep Convolutional Features for Image Retrieval

摘要

In this paper, we propose a simple but effective semantic part-based weighting aggregation (PWA) for image retrieval. The proposed PWA utilizes the discriminative filters of deep convolutional layers as part detectors. Moreover, we propose the effective unsupervised strategy to select some part detectors to generate the "probabilistic proposals," which highlight certain discriminative parts of objects and suppress the noise of background. The final global PWA representation could then be acquired by aggregating the regional representations weighted by the selected "probabilistic proposals" corresponding to various semantic content. We conduct comprehensive experiments on four standard datasets and show that our unsupervised PWA outperforms the state-of-the-art unsupervised and supervised aggregation methods.

关键词

Part-based weighting aggregation; Probabilistic proposals; Image retrieval

全文

PDF


Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

摘要

Dynamics of human body skeletons convey significant information for human action recognition. Conventional approaches for modeling skeletons usually rely on hand-crafted parts or traversal rules, thus resulting in limited expressive power and difficulties of generalization. In this work, we propose a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data. This formulation not only leads to greater expressive power but also stronger generalization capability. On two large datasets, Kinetics and NTU-RGBD, it achieves substantial improvements over mainstream methods.

关键词

graph convolutional network; skeleton; action recognition

全文

PDF


Domain-Shared Group-Sparse Dictionary Learning for Unsupervised Domain Adaptation

摘要

Unsupervised domain adaptation has been proved to be a promising approach to solve the problem of dataset bias. To employ source labels in the target domain, it is required to align the joint distributions of source and target data. To do this, the key research problem is to align conditional distributions across domains without target labels. In this paper, we propose a new criterion of domain-shared group-sparsity that is an equivalent condition for conditional distribution alignment. To solve the problem in joint distribution alignment, a domain-shared group-sparse dictionary learning method is developed towards joint alignment of conditional and marginal distributions. A classifier for target domain is trained using the domain-shared group-sparse coefficients and the target-specific information from the target data. Experimental results on cross-domain face and object recognition show that the proposed method outperforms eight state-of-the-art unsupervised domain adaptation algorithms.

关键词

Domain Adaptation; Dictionary Learning

全文

PDF


Multi-Scale Bidirectional FCN for Object Skeleton Extraction

摘要

Object skeleton detection is a challenging problem with wide application. Recently, deep Convolutional Neural Networks (CNNs) have substantially improved the performance of the state-of-the-art in this task. However, most of the existing CNN-Based methods are based on a skip-layer structure where low-level and high-level features are combined and learned so as to gather multi-level contextual information. As shallow features are too messy and lack semantic knowledge, they may cause errors and inaccuracy. Therefore, we propose a novel network architecture, Multi-Scale Bidirectional Fully Convolutional Network (MSB-FCN), to better capture and consolidate multi-scale high-level context information for object skeleton detection. Our network uses only deep features to build multi-scale feature representations, and employs a bidirectional structure to collect contextual knowledge. Hence the proposed MSB-FCN has the ability to learn the semantic-level information from different sub-regions. Furthermore, we introduce dense connections into the bidirectional structure of our MSB-FCN to ensure that the learning process at each scale can directly encode information from all other scales. Extensive experiments on various commonly used benchmarks demonstrate that the proposed MSB-FCN has achieved significant improvements over the state-of-the-art algorithms.

关键词

Skeleton; fully convolutional network; bidirectional network

全文

PDF


Understanding Image Impressiveness Inspired by Instantaneous Human Perceptual Cues

摘要

With the explosion of visual information nowadays, millions of digital images are available to the users. How to efficiently explore a large set of images and retrieve useful information thus becomes extremely important. Unfortunately only some of the images can impress the user at first glance. Others that make little sense in human perception are often discarded, while still costing valuable time and space. Therefore, it is significant to identify these two kinds of images for relieving the load of online repositories and accelerating information retrieval process. However, most of the existing image properties, e.g., memorability and popularity, are based on repeated human interactions, which limit the research and application of evaluating image quality in terms of instantaneous impression. In this paper, we propose a novel image property, called impressiveness, that measures how images impress people with a short-term contact. This is based on an impression-driven model inspired by a number of important human perceptual cues. To achieve this, we first collect three datasets in various domains, which are labeled according to the instantaneous sensation of the annotators. Then we investigate the impressiveness property via six established human perceptual cues as well as the corresponding features from pixel to semantic levels. Sequentially, we verify the consistency of the impressiveness which can be quantitatively measured by multiple visual representations, and evaluate their latent relationships. Finally, we apply the proposed impressiveness property to rank the images for an efficient image recommendation system.

关键词

全文

PDF


Exploring Temporal Preservation Networks for Precise Temporal Action Localization

摘要

Temporal action localization is an important task of computer vision. Though a variety of methods have been proposed, it still remains an open question how to predict the temporal boundaries of action segments precisely. Most works use segment-level classifiers to select video segments pre-determined by action proposal or dense sliding windows. However, in order to achieve more precise action boundaries, a temporal localization system should make dense predictions at a fine granularity. A newly proposed work exploits Convolutional-Deconvolutional-Convolutional (CDC) filters to upsample the predictions of 3D ConvNets, making it possible to perform per-frame action predictions and achieving promising performance in terms of temporal action localization. However, CDC network loses temporal information partially due to the temporal downsampling operation. In this paper, we propose an elegant and powerful Temporal Preservation Convolutional (TPC) Network that equips 3D ConvNets with TPC filters. TPC network can fully preserve temporal resolution and downsample the spatial resolution simultaneously, enabling frame-level granularity action localization with minimal loss of time information. TPC network can be trained in an end-to-end manner. Experiment results on public datasets show that TPC network achieves significant improvement in both per-frame action prediction and segment-level temporal action localization.

关键词

全文

PDF


Towards Perceptual Image Dehazing by Physics-Based Disentanglement and Adversarial Training

摘要

Single image dehazing is a challenging under-constrained problem because of the ambiguities of unknown scene radiance and transmission. Previous methods solve this problem using various hand-designed priors or by supervised training on synthetic hazy image pairs. In practice, however, the predefined priors are easily violated and the paired image data is unavailable for supervised training. In this work, we propose Disentangled Dehazing Network, an end-to-end model that generates realistic haze-free images using only unpaired supervision. Our approach alleviates the paired training constraint by introducing a physical-model based disentanglement and reconstruction mechanism. A multi-scale adversarial training is employed to generate perceptually haze-free images. Experimental results on synthetic datasets demonstrate our superior performance compared with the state-of-the-art methods in terms of PSNR, SSIM and CIEDE2000. Through training on purely natural haze-free and hazy images from our collected HazyCity dataset, our model can generate more perceptually appealing dehazing results.

关键词

Image dehazing; Generative Adversarial Network

全文

PDF


Unsupervised Learning of Geometry From Videos With Edge-Aware Depth-Normal Consistency

摘要

Learning to reconstruct depths from a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years, e.g. (Zhou et al. 2017). In this paper, we propose to use surface normal representation for unsupervised depth estimation framework. Our estimated depths are constrained to be compatible with predicted normals, yielding more robust geometry results. Specifically, we formulate an edge-aware depth-normal consistency term, and solve it by constructing a depth-to-normal layer and a normal-to-depth layer inside of the DCN. The depth-to-normal layer takes estimated depths as input, and computes normal directions using cross production based on neighboring pixels. Then given the estimated normals, the normal-to-depth layer outputs a regularized depth map through local planar smoothness. Both layers are computed with awareness of edges inside the image to help address the issue of depth/normal discontinuity and preserve sharp edges. Finally, to train the network, we apply the photometric error and gradient smoothness to supervise both depth and normal predictions. We conducted experiments on both outdoor (KITTI) and indoor (NYUv2) datasets, and showed that our algorithm vastly outperforms state-of-the-art, which demonstrates the benefits of our approach.

关键词

unsupervised learning;depth;3D geometry

全文

PDF


Hierarchical Discriminative Learning for Visible Thermal Person Re-Identification

摘要

Person re-identification is widely studied in visible spectrum, where all the person images are captured by visible cameras. However, visible cameras may not capture valid appearance information under poor illumination conditions, e.g, at night. In this case, thermal camera is superior since it is less dependent on the lighting by using infrared light to capture the human body. To this end, this paper investigates a cross-modal re-identification problem, namely visible-thermal person re-identification (VT-REID). Existing cross-modal matching methods mainly focus on modeling the cross-modality discrepancy, while VT-REID also suffers from cross-view variations caused by different camera views. Therefore, we propose a hierarchical cross-modality matching model by jointly optimizing the modality-specific and modality-shared metrics. The modality-specific metrics transform two heterogenous modalities into a consistent space that modality-shared metric can be subsequently learnt. Meanwhile, the modality-specific metric compacts features of the same person within each modality to handle the large intra-modality intra-person variations (e.g. viewpoints, pose). Additionally, an improved two-stream CNN network is presented to learn the multi-modality sharable feature representations. Identity loss and contrastive loss are integrated to enhance the discriminability and modality-invariance with partially shared layer parameters. Extensive experiments illustrate the effectiveness and robustness of the proposed method.

关键词

Person Re-Identification;Cross Modality

全文

PDF


Co-Saliency Detection Within a Single Image

摘要

Recently, saliency detection in a single image and co-saliency detection in multiple images have drawn extensive research interest in the vision community. In this paper, we investigate a new problem of co-saliency detection within a single image, i.e., detecting within-image co-saliency. By identifying common saliency within an image, e.g., highlighting multiple occurrences of an object class with similar appearance, this work can benefit many important applications, such as the detection of objects of interest, more robust object recognition, reduction of information redundancy, and animation synthesis. We propose a new bottom-up method to address this problem. Specifically, a large number of object proposals are first detected from the image. Then we develop an optimization algorithm to derive a set of proposal groups, each of which contains multiple proposals showing good common saliency in the original image. For each proposal group, we calculate a co-saliency map and then use a low-rank based algorithm to fuse the maps calculated from all the proposal groups for the final co-saliency map in the image. In the experiment, we collect a new dataset of 364 color images with within-image cosaliency. Experiment results show that the proposed method can better detect the within-image co-saliency than existing algorithms.

关键词

Co-Saliency, optimization, low rank

全文

PDF


Deep Stereo Matching With Explicit Cost Aggregation Sub-Architecture

摘要

Deep neural networks have shown excellent performance for stereo matching. Many efforts focus on the feature extraction and similarity measurement of the matching cost computation step while less attention is paid on cost aggregation which is crucial for stereo matching. In this paper, we present a learning-based cost aggregation method for stereo matching by a novel sub-architecture in the end-to-end trainable pipeline. We reformulate the cost aggregation as a learning process of the generation and selection of cost aggregation proposals which indicate the possible cost aggregation results. The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals. The criterion for the selection is determined by the low-level structure information obtained from a light convolutional network. The two-stream network offers a global view guidance for the cost aggregation to rectify the mismatching value stemming from the limited view of the matching cost computation. The comprehensive experiments on challenge datasets such as KITTI and Scene Flow show that our method outperforms the state-of-the-art methods.

关键词

Stereo Matching; Cost Aggregation

全文

PDF


A Deep Ranking Model for Spatio-Temporal Highlight Detection From a 360◦ Video

摘要

We address the problem of highlight detection from a 360◦ video by summarizing it both spatially and temporally. Given a long 360◦ video, we spatially select pleasantly-looking normal field-of-view (NFOV) segments from unlimited field of views (FOV) of the 360◦ video, and temporally summarize it into a concise and informative highlight as a selected subset of subshots. We propose a novel deep ranking model named as Composition View Score (CVS) model, which produces a spherical score map of composition per video segment, and determines which view is suitable for highlight via a sliding window kernel at inference. To evaluate the proposed framework, we perform experiments on the Pano2Vid benchmark dataset (Su, Jayaraman, and Grauman 2016) and our newly collected 360◦ video highlight dataset from YouTube and Vimeo. Through evaluation using both quantitative summarization metrics and user studies via Amazon Mechanical Turk, we demonstrate that our approach outperforms several state-of-the-art highlight detection methods.We also show that our model is 16 times faster at inference than AutoCam (Su, Jayaraman, and Grauman 2016), which is one of the first summarization algorithms of 360◦ videos.

关键词

Automatic summarization of 360 degree videos ;Deep pairwise ranking models ;Weakly supervised large-scale web video training

全文

PDF


Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

摘要

Deep convolutional networks for semantic image segmentation typically require large-scale labeled data, e.g., ImageNet and MS COCO, for network pre-training. To reduce annotation efforts, self-supervised semantic segmentation is recently proposed to pre-train a network without any human-provided labels. The key of this new form of learning is to design a proxy task (e.g., image colorization), from which a discriminative loss can be formulated on unlabeled data. Many proxy tasks, however, lack the critical supervision signals that could induce discriminative representation for the target image segmentation task. Thus self-supervision’s performance is still far from that of supervised pre-training. In this study, we overcome this limitation by incorporating a "mix-and-match" (M&M) tuning stage in the self-supervision pipeline. The proposed approach is readily pluggable to many self-supervision methods and does not use more annotated samples than the original process. Yet, it is capable of boosting the performance of target image segmentation task to surpass fully-supervised pre-trained counterpart. The improvement is made possible by better harnessing the limited pixel-wise annotations in the target dataset. Specifically, we first introduce the "mix" stage, which sparsely samples and mixes patches from the target set to reflect rich and diverse local patch statistics of target images. A ‘match’ stage then forms a class-wise connected graph, which can be used to derive a strong triplet-based discriminative loss for finetuning the network. Our paradigm follows the standard practice in existing self-supervised studies and no extra data or label is required. With the proposed M&M approach, for the first time, a self-supervision method can achieve comparable or even better performance compared to its ImageNet pretrained counterpart on both PASCAL VOC2012 dataset and CityScapes dataset.

关键词

self-supervision; semantic segmentation; mix-and-match

全文

PDF


Audio Visual Attribute Discovery for Fine-Grained Object Recognition

摘要

Current progresses on fine-grained recognition are mainly focus on learning the discriminative feature representation via introducing the visual supervisions e.g. part labels. However, it is time-consuming and needs the professional knowledge to obtain the accuracy annotations. Different from these existing methods based on the visual supervisions, in this paper, we introduce a novel feature named audio visual attributes via discovering the correlations between the visual and audio representations. Specifically, our unified framework is training with video-level category label, which consists of two important modules, the encoder module and the attribute discovery module, to encode the image and audio into vectors and learn the correlations between audio and images, respectively. On the encoder module, we present two types of feed forward convolutional neural network for the image and audio modalities. While an attention driven framework based on recurrent neural network is developed to generate the audio visual attribute representation. Thus, our proposed architecture can be implemented end-to-end in the step of inference. We exploit our models for the problem of fine-grained bird recognition on the CUB200-211 benchmark. The experimental results demonstrate that with the help of audio visual attribute, we achieve the superior or comparable performance to that of strongly supervised approaches on the bird recognition.

关键词

全文

PDF


Kill Two Birds With One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement

摘要

The number of social images has exploded by the wide adoption of social networks, and people like to share their comments about them. These comments can be a description of the image, or some objects, attributes, scenes in it, which are normally used as the user-provided tags. However, it is well-known that user-provided tags are incomplete and imprecise to some extent. Directly using them can damage the performance of related applications, such as the image annotation and retrieval. In this paper, we propose to learn an image annotation model and refine the user-provided tags simultaneously in a weakly-supervised manner. The deep neural network is utilized as the image feature learning and backbone annotation model, while visual consistency, semantic dependency, and user-error sparsity are introduced as the constraints at the batch level to alleviate the tag noise. Therefore, our model is highly flexible and stable to handle large-scale image sets. Experimental results on two benchmark datasets indicate that our proposed model achieves the best performance compared to the state-of-the-art methods.

关键词

Weakly-Supervised Neural Network; Social Image Annotation; Tag Refinement

全文

PDF


Face Sketch Synthesis From Coarse to Fine

摘要

Synthesizing fine face sketches from photos is a valuable yet challenging problem in digital entertainment. Face sketches synthesized by conventional methods usually exhibit coarse structures of faces, whereas fine details are lost especially on some critical facial components. In this paper, by imitating the coarse-to-fine drawing process of artists, we propose a novel face sketch synthesis framework consisting of a coarse stage and a fine stage. In the coarse stage, a mapping relationship between face photos and sketches is learned via the convolutional neural network. It ensures that the synthesized sketches keep coarse structures of faces. Given the test photo and the coarse synthesized sketch, a probabilistic graphic model is designed to synthesize the delicate face sketch which has fine and critical details. Experimental results on public face sketch databases illustrate that our proposed framework outperforms the state-of-the-art methods in both quantitive and visual comparisons.

关键词

全文

PDF


Accelerated Training for Massive Classification via Dynamic Class Selection

摘要

Massive classification, a classification task defined over a vast number of classes (hundreds of thousands or even millions), has become an essential part of many real-world systems, such as face recognition. Existing methods, including the deep networks that achieved remarkable success in recent years, were mostly devised for problems with a moderate number of classes. They would meet with substantial difficulties, e.g., excessive memory demand and computational cost, when applied to massive problems. We present a new method to tackle this problem. This method can efficiently and accurately identify a small number of "active classes" for each mini-batch, based on a set of dynamic class hierarchies constructed on the fly. We also develop an adaptive allocation scheme thereon, which leads to a better tradeoff between performance and cost. On several large-scale benchmarks, our method significantly reduces the training cost and memory demand, while maintaining competitive performance.

关键词

Classification; Deep Learning; Softmax

全文

PDF


FLIC: Fast Linear Iterative Clustering With Active Search

摘要

In this paper, we reconsider the clustering problem for image over-segmentation from a new perspective. We propose a novel search algorithm named “active search” which explicitly considers neighboring continuity. Based on this search method, we design a back-and-forth traversal strategy and a "joint" assignment and update step to speed up the algorithm. Compared to earlier works, such as Simple Linear Iterative Clustering (SLIC) and its follow-ups, who use fixed search regions and perform the assignment and the update step separately, our novel scheme reduces the iteration number before convergence, as well as improves boundary sensitivity of the over-segmentation results. Extensive evaluations on the Berkeley segmentation benchmark verify that our method outperforms competing methods under various evaluation metrics. In particular, lowest time cost is reported among existing methods (approximately 30 fps for a 481321 image on a single CPU core). To facilitate the development of over-segmentation, the code will be publicly available.

关键词

VIS: Object Detection; VIS: Object Recognition

全文

PDF


Deep Reinforcement Learning for Unsupervised Video Summarization With Diversity-Representativeness Reward

摘要

Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos. In this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (DSN) to summarize videos. DSN predicts for each video frame a probability, which indicates how likely a frame is selected, and then takes actions based on the probability distributions to select frames, forming video summaries. To train our DSN, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for diversity and representativeness of generated summaries and does not rely on labels or user interactions at all. During training, the reward function judges how diverse and representative the generated summaries are, while DSN strives for earning higher rewards by learning to produce more diverse and more representative summaries. Since labels are not required, our method can be fully unsupervised. Extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but also is comparable to or even superior than most of published supervised approaches.

关键词

Video Summarization; Reinforcement Learning

全文

PDF


Towards Automatic Learning of Procedures From Web Instructional Videos

摘要

The potential for agents, whether embodied or software, to learn by observing other agents performing procedures involving objects and actions is rich. Current research on automatic procedure learning heavily relies on action labels or video subtitles, even during the evaluation phase, which makes them infeasible in real-world scenarios. This leads to our question: can the human-consensus structure of a procedure be learned from a large set of long, unconstrained videos (e.g., instructional videos from YouTube) with only visual evidence? To answer this question, we introduce the problem of procedure segmentation---to segment a video procedure into category-independent procedure segments. Given that no large-scale dataset is available for this problem, we collect a large-scale procedure segmentation dataset with procedure segments temporally localized and described; we use cooking videos and name the dataset YouCook2. We propose a segment-level recurrent network for generating procedure segments by modeling the dependencies across segments. The generated segments can be used as pre-processing for other tasks, such as dense video captioning and event parsing. We show in our experiments that the proposed model outperforms competitive baselines in procedure segmentation.

关键词

Deep Learning; Computer Vision; Artificial Intelligence; Video Understanding; Language and Vision; YouCook2 Dataset

全文

PDF


Graph Correspondence Transfer for Person Re-Identification

摘要

In this paper, we propose a graph correspondence transfer (GCT) approach for person re-identification. Unlike existing methods, the GCT model formulates person re-identification as an off-line graph matching and on-line correspondence transferring problem. In specific, during training, the GCT model aims to learn off-line a set of correspondence templates from positive training pairs with various pose-pair configurations via patch-wise graph matching. During testing, for each pair of test samples, we select a few training pairs with the most similar pose-pair configurations as references, and transfer the correspondences of these references to test pair for feature distance calculation. The matching score is derived by aggregating distances from different references. For each probe image, the gallery image with the highest matching score is the re-identifying result. Compared to existing algorithms, our GCT can handle spatial misalignment caused by large variations in view angles and human poses owing to the benefits of patch-wise graph matching. Extensive experiments on five benchmarks including VIPeR, Road, PRID450S, 3DPES and CUHK01 evidence the superior performance of GCT model over other state-of-the-art methods.

关键词

person re-identification; graph matching

全文

PDF


Progressive Cognitive Human Parsing

摘要

Human parsing is an important task for human-centric understanding. Generally, two mainstreams are used to deal with this challenging and fundamental problem. The first one is employing extra human pose information to generate hierarchical parse graph to deal with human parsing task. Another one is training an end-to-end network with the semantic information in image level. In this paper, we develop an end-to-end progressive cognitive network to segment human parts. In order to establish a hierarchical relationship, a novel component-aware region convolution structure is proposed. With this structure, latter layers inherit prior component information from former layers and pay its attention to a finer component. In this way, we deal with human parsing as a progressive recognition task, that is, we first locate the whole human and then segment the hierarchical components gradually. The experiments indicate that our method has a better location capacity for the small objects and a better classification capacity for the large objects. Moreover, our framework can be embedded into any fully convolutional network to enhance the performance significantly.

关键词

全文

PDF


Learning Adversarial 3D Model Generation With 2D Image Enhancer

摘要

Recent advancements in generative adversarial nets (GANs) and volumetric convolutional neural networks (CNNs) enable generating 3D models from a probabilistic space. In this paper, we have developed a novel GAN-based deep neural network to obtain a better latent space for the generation of 3D models. In the proposed method, an enhancer neural network is introduced to extract information from other corresponding domains (e.g. image) to improve the performance of the 3D model generator, and the discriminative power of the unsupervised shape features learned from the 3D model discriminator. Specifically, we train the 3D generative adversarial networks on 3D volumetric models, and at the same time, the enhancer network learns image features from rendered images. Different from the traditional GAN architecture that uses uninformative random vectors as inputs, we feed the high-level image features learned from the enhancer into the 3D model generator for better training. The evaluations on two large-scale 3D model datasets, ShapeNet and ModelNet, demonstrate that our proposed method can not only generate high-quality 3D models, but also successfully learn discriminative shape representation for classification and retrieval without supervision.

关键词

Shape generation; cross-domain GAN; shape classification; shape retrieval

全文

PDF


Deep Structured Learning for Visual Relationship Detection

摘要

In the research area of computer vision and artificial intelligence, learning the relationships of objects is an important way to deeply understand images. Most of recent works detect visual relationship by learning objects and predicates respectively in feature level, but the dependencies between objects and predicates have not been fully considered. In this paper, we introduce deep structured learning for visual relationship detection. Specifically, we propose a deep structured model, which learns relationship by using feature-level prediction and label-level prediction to improve learning ability of only using feature-level predication. The feature-level prediction learns relationship by discriminative features, and the label-level prediction learns relationships by capturing dependencies between objects and predicates based on the learnt relationship of feature level. Additionally, we use structured SVM (SSVM) loss function as our optimization goal, and decompose this goal into the subject, predicate, and object optimizations which become more simple and more independent. Our experiments on the Visual Relationship Detection (VRD) dataset and the large-scale Visual Genome (VG) dataset validate the effectiveness of our method, which outperforms state-of-the-art methods.

关键词

visual relationship detection; deep structured learning

全文

PDF


HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection

摘要

Visual relationship detection aims to capture interactions between pairs of objects in images. Relationships between objects and humans represent a particularly important subset of this problem, with implications for challenges such as understanding human behavior, and identifying affordances, amongst others. In addressing this problem we first construct a large-scale human-centric visual relationship detection dataset (HCVRD), which provides many more types of relationship annotations (nearly 10K categories) than the previous released datasets. This large label space better reflects the reality of human-object interactions, but gives rise to a long-tail distribution problem, which in turn demands a zero-shot approach to labels appearing only in the test set. This is the first time this issue has been addressed. We propose a webly-supervised approach to these problems and demonstrate that the proposed model provides a strong baseline on our HCVRD dataset.

关键词

全文

PDF


3D Box Proposals From a Single Monocular Image of an Indoor Scene

摘要

Modern object detection methods typically rely on bounding box proposals as input. While initially popularized in the 2D case, this idea has received increasing attention for 3D bounding boxes. Nevertheless, existing 3D box proposal techniques all assume having access to depth as input, which is unfortunately not always available in practice. In this paper, we therefore introduce an approach to generating 3D box proposals from a single monocular RGB image. To this end, we develop an integrated, fully differentiable framework that inherently predicts a depth map, extracts a 3D volumetric scene representation and generates 3D object proposals. At the core of our approach lies a novel residual, differentiable truncated signed distance function module, which, accounting for the relatively low accuracy of the predicted depth map, extracts a 3D volumetric representation of the scene. Our experiments on the standard NYUv2 dataset demonstrate that our framework lets us generate high-quality 3D box proposals and that it outperforms the two-stage technique consisting of successively performing state-of-the-art depth prediction and depth-based 3D proposal generation.

关键词

Indoor Scene Understanding; 3D Box Proposal; Deep Learning

全文

PDF


Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process

摘要

Crowd behavior understanding is crucial yet challenging across a wide range of applications, since crowd behavior is inherently determined by a sequential decision-making process based on various factors, such as the pedestrians' own destinations, interaction with nearby pedestrians and anticipation of upcoming events. In this paper, we propose a novel framework of Social-Aware Generative Adversarial Imitation Learning (SA-GAIL) to mimic the underlying decision-making process of pedestrians in crowds. Specifically, we infer the latent factors of human decision-making process in an unsupervised manner by extending the Generative Adversarial Imitation Learning framework to anticipate future paths of pedestrians. Different factors of human decision making are disentangled with mutual information maximization, with the process modeled by collision avoidance regularization and Social-Aware LSTMs. Experimental results demonstrate the potential of our framework in disentangling the latent decision-making factors of pedestrians and stronger abilities in predicting future trajectories.

关键词

Vision, crowd, Imitation Learning

全文

PDF


Secure and Automated Enterprise Revenue Forecasting

摘要

Revenue forecasting is required by most enterprises for strategic business planning and for providing expected future results to investors. However, revenue forecasting processes in most companies are time-consuming and error-prone as they are performed manually by hundreds of financial analysts. In this paper, we present a novel machine learning based revenue forecasting solution that we developed to forecast 100% of Microsoft's revenue (around $85 Billion in 2016), and is now deployed into production as an end-to-end automated and secure pipeline in Azure. Our solution combines historical trend and seasonal patterns with additional information, e.g., sales pipeline data, within a unified modeling framework. In this paper, we describe our framework including the features, method for hyperparameters tuning of ML models using time series cross-validation, and generation of prediction intervals. We also describe how we architected an end-to-end secure and automated revenue forecasting solution on Azure using Cortana Intelligence Suite. Over consecutive quarters, our machine learning models have continuously produced forecasts with an average accuracy of 98-99 percent for various divisions within Microsoft's Finance organization. As a result, our models have been widely adopted by them and are now an integral part of Microsoft's most important forecasting processes, from providing Wall Street guidance to managing global sales performance.

关键词

time series forecasting, enterprise revenue forecasting, secure cloud computing

全文

PDF


Sketch Worksheets in STEM Classrooms: Two Deployments

摘要

Sketching can be a valuable tool for science education, but it is currently underutilized. Sketch worksheets were developed to help change this, by using AI technology to give students immediate feedback and to give instructors assistance in grading. Sketch worksheets use visual representations automatically computed by CogSketch, which are combined with conceptual information from the OpenCyc ontology. Feedback is provided to students by comparing an instructor’s sketch to a student’s sketch, using the Structure-Mapping Engine. This paper describes our experiences in deploying sketch worksheets in two types of classes: Geoscience and AI. Sketch worksheets for introductory geoscience classes were developed by geoscientists at University of Wisconsin-Madison, authored using CogSketch and used in classes at both Wisconsin and Northwestern University. Sketch worksheets were also developed and deployed for a knowledge representation and reasoning course at Northwestern. Our experience indicates that sketch worksheets can provide helpful on-the-spot feedback to students, and significantly improve grading efficiency, to the point where sketching assignments can be more practical to use broadly in STEM education.

关键词

Sketch Understanding; Analogy; Intelligent Tutoring; Educational Software

全文

PDF


An Automated Employee Timetabling System for Small Businesses

摘要

Employee scheduling is one of the most difficult challenges facing any small business owner. The problem becomes more complex when employees with different levels of seniority indicate preferences for specific roles in certain shifts and request flexible work hours outside of the standard eight-hour block. Many business owners and managers, who cannot afford (or choose not to use) commercially-available timetabling apps, spend numerous hours creating sub-optimal schedules by hand, leading to low staff morale. In this paper, we explain how two undergraduate students generalized the Nurse Scheduling Problem to take into account multiple roles and flexible work hours, and implemented a user-friendly automated timetabler based on a four-dimensional integer linear program. This system has been successfully deployed at two businesses in our community, each with 20+ employees: a coffee shop and a health clinic.

关键词

constraint optimization; constraint satisfaction; scheduling; combinatorial optimization

全文

PDF


Horizontal Scaling With a Framework for Providing AI Solutions Within a Game Company

摘要

Games have been a major focus of AI since the field formed seventy years ago. Recently, video games have replaced chess and go as the current "Mt. Everest Problem." This paper looks beyond the video games themselves to the application of AI techniques within the ecosystems that produce them. Electronic Arts (EA) must deal with AI at scale across many game studios as it develops many AAA games each year, and not a single, AI-based, flagship application. EA has adopted a horizontal scaling strategy in response to this challenge and built a platform for delivering AI artifacts anywhere within EA's software universe. By combining a data warehouse for player history, an Agent Store for capturing processes acquired through machine learning, and a recommendation engine as an action layer, EA has been delivering a wide range of AI solutions throughout the company during the last two years. These solutions, such as dynamic difficulty adjustment, in-game content and activity recommendations, matchmaking, and game balancing, have had major impact on engagement, revenue, and development resources within EA.

关键词

全文

PDF


Hi, How Can I Help You?: Automating Enterprise IT Support Help Desks

摘要

Question answering is one of the primary challenges of natural language understanding. In realizing such a system, providing complex long answers to questions is a challenging task as opposed to factoid answering as the former needs context disambiguation. The different methods explored in the literature can be broadly classified into three categories namely: 1) classification based, 2) knowledge graph based and 3) retrieval based. Individually, none of them address the need of an enterprise wide assistance system for an IT support and maintenance domain. In this domain, the variance of answers is large ranging from factoid to structured operating procedures; the knowledge is present across heterogeneous data sources like application specific documentation, ticket management systems and any single technique for a general purpose assistance is unable to scale for such a landscape. To address this, we have built a cognitive platform with capabilities adopted for this domain. Further, we have built a general purpose question answering system leveraging the platform that can be instantiated for multiple products, technologies in the support domain. The system uses a novel hybrid answering model that orchestrates across a deep learning classifier, a knowledge graph based context disambiguation module and a sophisticated bag-of-words search system. This orchestration performs context switching for a provided question and also does a smooth hand-off of the question to a human expert if none of the automated techniques can provide a confident answer. This system has been deployed across 675 internal enterprise IT support and maintenance projects.

关键词

Question Answering, Cognitive Ensemble, IT support, Machine Learning

全文

PDF


Sentient Ascend: AI-Based Massively Multivariate Conversion Rate Optimization

摘要

Conversion rate optimization (CRO) means designing an e-commerce web interface so that as many users as possible take a desired action such as registering for an account, requesting a contact, or making a purchase. Such design is usually done by hand, evaluating one change at a time through A/B testing, or evaluating all combinations of two or three variables through multivariate testing. Traditional CRO is thus limited to a small fraction of the design space only. This paper describes Sentient Ascend, an automatic CRO system that uses evolutionary search to discover effective web interfaces given a human-designed search space. Design candidates are evaluated in parallel on line with real users, making it possible to discover and utilize interactions between the design elements that are difficult to identify otherwise. A commercial product since September 2016, Ascend has been applied to numerous web interfaces across industries and search space sizes, with up to four-fold improvements over human design. Ascend can therefore be seen as massively multivariate CRO made possible by AI.

关键词

Conversion optimization; e-commerce; evolutionary computation; design

全文

PDF


SmartHS: An AI Platform for Improving Government Service Provision

摘要

Over the years, government service provision in China has been plagued by inefficiencies. Previous attempts to address this challenge following a toolbox e-government system model in China were not effective. In this paper, we report on a successful experience in improving government service provision in the domain of social insurance in Shandong Province, China. Through standardization of service workflows following the Complete Contract Theory (CCT) and the infusion of an artificial intelligence (AI) engine to maximize the expected quality of service while reducing waiting time, the Smart Human-resource Services (SmartHS) platform transcends organizational boundaries and improves system efficiency. Deployments in 3 cities involving 2,000 participating civil servants and close to 3 million social insurance service cases over a 1 year period demonstrated that SmartHS significantly improves user experience with roughly a third of the original front desk staff. This new AI-enhanced mode of operation is useful for informing current policy discussions in many domains of government service provision.

关键词

Optimization; Social Insurance

全文

PDF


Bandit-Based Solar Panel Control

摘要

Solar panels sustainably harvest energy from the sun. To improve performance, panels are often equipped with a tracking mechanism that computes the sun’s position in the sky throughout the day. Based on the tracker’s estimate of the sun’s location, a controller orients the panel to minimize the angle of incidence between solar radiant energy and the photovoltaic cells on the surface of the panel, increasing total energy harvested. Prior work has developed efficient tracking algorithms that accurately compute the sun’s location to facilitate solar tracking and control. However, always pointing a panel directly at the sun does not account for diffuse irradiance in the sky, reflected irradiance from the ground and surrounding surfaces, power required to reorient the panel, shading effects from neighboring panels and foliage, or changing weather conditions (such as clouds), all of which are contributing factors to the total energy harvested by a fleet of solar panels. In this work, we show that a bandit-based approach can increase the total energy harvested by solar panels by learning to dynamically account for such other factors. Our contribution is threefold: (1) the development of a test bed based on typical solar and irradiance models for experimenting with solar panel control using a variety of learning methods, (2) simulated validation that bandit algorithms can effectively learn to control solar panels, and (3) the design and construction of an intelligent solar panel prototype that learns to angle itself using bandit algorithms.

关键词

Solar Panels; Solar Tracking; Reinforcement Learning; Computational Sustainability; Contextual Bandit

全文

PDF


Death vs. Data Science: Predicting End of Life

摘要

Death is an inevitable part of life and while it cannot be delayed indefinitely it is possible to predict with some certainty when the health of a person is going to deteriorate. In this paper, we predict risk of mortality for patients from two large hospital systems in the Pacific Northwest. Using medical claims and electronic medical records (EMR) data we greatly improve prediction for risk of mortality and explore machine learning models with explanations for end of life predictions. The insights that are derived from the predictions can then be used to improve the quality of patient care towards the end of life.

关键词

eol, risk prediction, emr, claims data, heath risk prediction

全文

PDF


CRM Sales Prediction Using Continuous Time-Evolving Classification

摘要

Customer Relationship Management (CRM) systems play an important role in helping companies identify and keep sales and service prospects. CRM service providers offer a range of tools and techniques that will help find, sell to and keep customers. To be effective, CRM users usually require extensive training. Predictive CRM using machine learning expands the capabilities of traditional CRM through the provision of predictive insights for CRM users by combining internal and external data. In this paper, we will explore a novel idea of computationally learning salesmanship, its patterns and success factors to drive industry intuitions for a more predictable road to a vehicle sale. The newly discovered patterns and insights are used to act as a virtual guide or trainer for the general CRM user population.

关键词

Automotive CRM; Predictive CRM; Predictive Analytics; Machine Learning; Sales Prediction;

全文

PDF


Novel Exploration Techniques (NETs) for Malaria Policy Interventions

摘要

The task of decision-making under uncertainty is daunting, especially for problems which have significant complexity. Healthcare policy makers across the globe are facing problems under challenging constraints, with limited tools to help them make data driven decisions. In this work we frame the process of finding an optimal malaria policy as a stochastic multi-armed bandit problem, and implement three agent based strategies to explore the policy space. We apply a Gaussian Process regression to the findings of each agent, both for comparison and to account for stochastic results from simulating the spread of malaria in a fixed population. The generated policy spaces are compared with published results to give a direct reference with human expert decisions for the same simulated population. Our novel approach provides a powerful resource for policy makers, and a platform which can be readily extended to capture future more nuanced policy spaces.

关键词

Malaria; Reinforcement Learning; Bandit; Gaussian Process

全文

PDF


SPOT Poachers in Action: Augmenting Conservation Drones With Automatic Detection in Near Real Time

摘要

The unrelenting threat of poaching has led to increased development of new technologies to combat it. One such example is the use of long wave thermal infrared cameras mounted on unmanned aerial vehicles (UAVs or drones) to spot poachers at night and report them to park rangers before they are able to harm animals. However, monitoring the live video stream from these conservation UAVs all night is an arduous task. Therefore, we build SPOT (Systematic POacher deTector), a novel application that augments conservation drones with the ability to automatically detect poachers and animals in near real time. SPOT illustrates the feasibility of building upon state-of-the-art AI techniques, such as Faster RCNN, to address the challenges of automatically detecting animals and poachers in infrared images. This paper reports (i) the design and architecture of SPOT, (ii) a series of efforts towards more robust and faster processing to make SPOT usable in the field and provide detections in near real time, and (iii) evaluation of SPOT based on both historical videos and a real-world test run by the end users in the field. The promising results from the test in the field have led to a plan for larger-scale deployment in a national park in Botswana. While SPOT is developed for conservation drones, its design and novel techniques have wider application for automated detection from UAV videos.

关键词

unmanned aerial vehicles; object detection; anti-poaching

全文

PDF


InspireMe: Learning Sequence Models for Stories

摘要

We present a novel approach to modeling stories using recurrent neural networks. Different story features are extracted using natural language processing techniques and used to encode the stories as sequences. These sequences can be learned by deep neural networks, in order to predict the next story events. The predictions can be used as an inspiration for writers who experience a writer's block. We further assist writers in their creative process by generating visualizations of the character interactions in the story. We show that suggestions from our model are rated as highly as the real scenes from a set of films and that our visualizations can help people in gaining deeper story understanding.

关键词

narrative intelligence; deep learning; natural language processing; writer's block; creative writing

全文

PDF


Assessing National Development Plans for Alignment With Sustainable Development Goals via Semantic Search

摘要

The United Nations Development Programme (UNDP) helps countries implement the United Nations (UN) Sustainable Development Goals (SDGs), an agenda for tackling major societal issues such as poverty, hunger, and environmental degradation by the year 2030. A key service provided by UNDP to countries that seek it is a review of national development plans and sector strategies by policy experts to assess alignment of national targets with one or more of the 169 targets of the 17 SDGs. Known as the Rapid Integrated Assessment (RIA), this process involves manual review of hundreds, if not thousands, of pages of documents and takes weeks to complete. In this work, we develop a natural language processing-based methodology to accelerate the workflow of policy experts. Specifically we use paragraph embedding techniques to find paragraphs in the documents that match the semantic concepts of each of the SDG targets. One novel technical contribution of our work is in our use of historical RIAs from other countries as a form of neighborhood-based supervision for matches in the country under study. We have successfully piloted the algorithm to perform the RIA for Papua New Guinea’s national plan, with the UNDP estimating it will help reduce their completion time from an estimated 3-4 weeks to 3 days.

关键词

Social Good; United Nations; Semantic Search; Sustainable Development Goals; Word embeddings; word2vec; tf-idf

全文

PDF


Classification of Malware by Using Structural Entropy on Convolutional Neural Networks

摘要

The number of malicious programs has grown both in number and in sophistication. Analyzing the malicious intent of vast amounts of data requires huge resources and thus, effective categorization of malware is required. In this paper, the content of a malicious program is represented as an entropy stream, where each value describes the amount of entropy of a small chunk of code in a specific location of the file. Wavelet transforms are then applied to this entropy signal to describe the variation in the entropic energy. Motivated by the visual similarity between streams of entropy of malicious software belonging to the same family, we propose a file agnostic deep learning approach for categorization of malware. Our method exploits the fact that most variants are generated by using common obfuscation techniques and that compression and encryption algorithms retain some properties present in the original code. This allows us to find discriminative patterns that almost all variants in a family share. Our method has been evaluated using the data provided by Microsoft for the BigData Innovators Gathering Anti-Malware Prediction Challenge, and achieved promising results in comparison with the State of the Art.

关键词

Malware Classification; Entropy Analysis; Convolutional Neural Networks

全文

PDF


Optimal Pricing for Distance-Based Transit Fares

摘要

Numerous urban planners advocate for differentiated transit pricing to improve both ridership and service equity. Several metropolitan cities are considering switching to a more "fair fare system," where passengers pay according to the distance travelled, rather than a flat fare or zone boundary scheme that discriminates against various marginalized groups. In this paper, we present a two-part optimal pricing formula for switching to distance-based transit fares: the first formula maximizes forecasted revenue given a target ridership, and the second formula maximizes forecasted ridership given a target revenue. Both formulas hold for all price elasticities. Our theory has been successfully tested on the SkyTrain mass transit network in Metro Vancouver, British Columbia, with over 400,000 daily passengers. This research has served Metro Vancouver's transportation authority as they consider changing their fare structure for the first time in over 30 years.

关键词

Constraint Optimization; quadratic programming; quadratic optimization; transit;

全文

PDF


Discovering Program Topoi Through Clustering

摘要

Understanding source code of large open-source software projects is very challenging when there is only little documentation. New developers face the task of classifying a huge number of files and functions without any help. This paper documents a novel approach to this problem, called FEAT, that automatically extracts topoi from source code by using hierarchical agglomerative clustering. Program topoi summarize the main capabilities of a software system by presenting to developers clustered lists of functions together with an index of their relevant words. The clustering method used in FEAT exploits a new hybrid distance which combines both textual and structural elements automatically extracted from source code and comments. The experimental evaluation of FEAT shows that this approach is suitable to understand open-source software projects of size approaching 2,000 functions and 150 files, which opens the door for its deployment in the open-source community.

关键词

program understanding, clustering, feature extraction, feature location

全文

PDF


Upping the Game of Taxi Driving in the Age of Uber

摘要

In most cities, taxis play an important role in providing point-to-point transportation service. If the taxi service is reliable, responsive, and cost-effective, past studies show that taxi-like services can be a viable choice in replacing a significant amount of private cars. However, making taxi services efficient is extremely challenging, mainly due to the fact that taxi drivers are self-interested and they operate with only local information. Although past research has demonstrated how recommendation systems could potentially help taxi drivers in improving their performance, most of these efforts are not feasible in practice. This is mostly due to the lack of both the comprehensive data coverage and an efficient recommendation engine that can scale to tens of thousands of drivers. In this paper, we propose a comprehensive working platform called the Driver Guidance System (DGS). With real-time citywide taxi data provided by our collaborator in Singapore, we demonstrate how we can combine real-time data analytics and large-scale optimization to create a guidance system that can potentially benefit tens of thousands of taxi drivers. Via a realistic agent-based simulation, we demonstrate that drivers following DGS can significantly improve their performance over ordinary drivers, regardless of the adoption ratios. We have concluded our system designing and building and have recently entered the field trial phase.

关键词

mobility-on-demand; taxi driver guidance

全文

PDF


Mobile Network Failure Event Detection and Forecasting With Multiple User Activity Data Sets

摘要

As the demand for mobile network services increases, immediate detection and forecasting of network failure events have become important problems for service providers. Several event detection approaches have been proposed to tackle these problems by utilizing social data. However, these approaches have not tried to solve event detection and forecasting problems from multiple data sets, such as web access logs and search queries. In this paper, we propose a machine learning approach that incorporates multiple user activity data into detecting and forecasting failure events. Our approach is based on a two-level procedure. First, we introduce a novel feature construction method that treats both the imbalanced label problem and the data sparsity problem of user activity data. Second, we propose a model ensemble method that combines outputs of supervised and unsupervised learning models for each data set and gives accurate predictions of network service outage. We demonstrate the effectiveness of the proposed models by extensive experiments with real-world failure events occurred at a network service provider in Japan and three user activity data sets.

关键词

Event Detection; Event Forecasting; Network Failure Event; Multiple Data Sets; Ensemble Model

全文

PDF


Multi-Task Deep Learning for Predicting Poverty From Satellite Images

摘要

Estimating economic and developmental parameters such as poverty levels of a region from satellite imagery is a challenging problem that has many applications. We propose a two step approach to predict poverty in a rural region from satellite imagery. First, we engineer a multi-task fully convolutional deep network for simultaneously predicting the material of roof, source of lighting and source of drinking water from satellite images. Second, we use the predicted developmental statistics to estimate poverty. Using full-size satellite imagery as input, and without pre-trained weights, our models are able to learn meaningful features including roads, water bodies and farm lands, and achieve a performance that is close to the optimum. In addition to speeding up the training process, the multi-task fully convolutional model is able to discern task specific and independent feature representations.

关键词

deep learning; satellite image analysis; poverty prediction

全文

PDF


Investigating the Role of Ensemble Learning in High-Value Wine Identification

摘要

We tackle the problem of authenticating high value Italian wines through machine learning classification. The problem is a seriuos one, since protection of high quality wines from forgeries is worth several million of Euros each year. In a previous work we have identified some base models (in particular classifiers based on Bayesian network (BNC), multi-layer perceptron (MLP) and sequential minimal optimization (SMO)) that well behave using unexpensive chemical analyses of the interested wines. In the present paper, we investigate the role of esemble learning in the construction of more robust classifiers; results suggest that, while bagging and boosting may significantly improve both BNC and MLP, the SMO model is already very robust and efficient as a base learner. We report on results concerning both cross validation on two different datasets, as well as experiments with models trained with the above datasets and tested with a dataset of potentially fake wines; this has been synthesized from a generative probabilistic model learned from real samples and expert knowledge.

关键词

ensemble learning; fraud detection; wine classification

全文

PDF


Learning to Become an Expert: Deep Networks Applied to Super-Resolution Microscopy

摘要

With super-resolution optical microscopy, it is now possible to observe molecular interactions in living cells. The obtained images have a very high spatial precision but their overall quality can vary a lot depending on the structure of interest and the imaging parameters. Moreover, evaluating this quality is often difficult for non-expert users. In this work, we tackle the problem of learning the quality function of super-resolution images from scores provided by experts. More specifically, we are proposing a system based on a deep neural network that can provide a quantitative quality measure of a STED image of neuronal structures given as input. We conduct a user study in order to evaluate the quality of the predictions of the neural network against those of a human expert. Results show the potential while highlighting some of the limits of the proposed approach.

关键词

deep neural network regression;super-resolution microscopy;image quality

全文

PDF


Computer-Assisted Authoring for Natural Language Story Scripts

摘要

In order to assist scriptwriters during the process of story-writing, we have developed a system that can extract information from natural language stories, and allow for story-centric as well as character-centric reasoning. These inferencing capabilities are exposed to the user through intuitive querying systems, allowing the scriptwriter to ask the system questions about story and character information. We introduce knowledge bytes as atoms of information and demonstrate that the system can parse text into a stream of knowledge bytes and use these mentioned reasoning capabilities through logical reasoning.

关键词

Knowledge Based Systems; Knowledge Representation; Natural Language; Reasoning; Information Extraction; Real-time Systems

全文

PDF


A Water Demand Prediction Model for Central Indiana

摘要

Due to the limited natural water resources and the increase in population, managing water consumption is becoming an increasingly important subject worldwide. In this paper, we present and compare different machine learning models that are able to predict water demand for Central Indiana. The models are developed for two different time scales: daily and monthly. The input features for the proposed model include weather conditions (temperature, rainfall, snow), social features (holiday, median income), date (day of the year, month), and operational features (number of customers, previous water demand levels). The importance of these input features as accurate predictors is investigated. The results show that daily and monthly models based on recurrent neural networks produced the best results with an average error in prediction of 1.69% and 2.29%, respectively for 2016. These models achieve a high accuracy with a limited set of input features.

关键词

Prediction; Modeling; Neural Networks

全文

PDF


TipMaster: A Knowledge Base of Authoritative Local News Sources on Social Media

摘要

Twitter has become an important online source for real-time news dissemination. Especially, official accounts of local government and media outlets have provided newsworthy and authoritative information, revealing local trends and breaking news. In this paper, we describe TipMaster an automatically constructed knowledge base of Twitter accounts that are likely to report local news, from government agencies to local media outlets. First, we implement classifiers for detecting these accounts by integrating heterogeneous information from the accounts' textual metadata, profile images, and their tweet messages. Next, we demonstrate two use cases for TipMaster: 1) as a platform that monitors real-time social media messages for local breaking news, and 2) as an authoritative source for verifying nascent rumors. Experimental results show that our account classification algorithms achieve both high precision and recall (around 90%). The demonstrated case studies prove that our platform is able to detect local breaking news or debunk emergent rumors faster than mainstream media sources.

关键词

twitter; news; timeliness; social data;

全文

PDF


Adapting to Concept Drift in Credit Card Transaction Data Streams Using Contextual Bandits and Decision Trees

摘要

Credit card transactions predicted to be fraudulent by automated detection systems are typically handed over to human experts for verification. To limit costs, it is standard practice to select only the most suspicious transactions for investigation. We claim that a trade-off between exploration and exploitation is imperative to enable adaptation to changes in behavior (concept drift). Exploration consists of the selection and investigation of transactions with the purpose of improving predictive models, and exploitation consists of investigating transactions detected to be suspicious. Modeling the detection of fraudulent transactions as rewarding, we use an incremental Regression Tree learner to create clusters of transactions with similar expected rewards. This enables the use of a Contextual Multi-Armed Bandit (CMAB) algorithm to provide the exploration/exploitation trade-off. We introduce a novel variant of a CMAB algorithm that makes use of the structure of this tree, and use Semi-Supervised Learning to grow the tree using unlabeled data. The approach is evaluated on a real dataset and data generated by a simulator that adds concept drift by adapting the behavior of fraudsters to avoid detection. It outperforms frequently used offline models in terms of cumulative rewards, in particular in the presence of concept drift.

关键词

fraud; credit card; contextual bandit; decision tree; concept drift

全文

PDF


Aida: Intelligent Image Analysis to Automatically Detect Poems in Digital Archives of Historic Newspapers

摘要

We describe an intelligent image analysis approach to automatically detect poems in digitally archived historic newspapers. Our application, Image Analysis for Archival Discovery, or Aida, integrates computer vision to capture visual cues based on visual structures of poetic works—instead of the meaning or content—and machine learning to train an artificial neural network to determine whether an image has poetic text. We have tested our application on almost 17,000 image snippets and obtained promising accuracies, precision, and recall. The application is currently being deployed at two institutions for digital library and literary research.

关键词

全文

PDF


VoC-DL: Revisiting Voice Of Customer Using Deep Learning

摘要

In the field of digital marketing, understanding the voice of the customer is paramount. Mining textual content written by visitors on websites or social media can offer new dimensions to marketers and CX executives. Traditional tasks in NLP like sentiment analysis, topic modeling etc. can solve only certain specific problems but don’t provide a generic solution to identifying/understanding the intention behind a text. In this paper we consider higher dimensional extensions to the sentiment concept by incorporating labels like product enquiry, buying intent, seeking help, feedback and pricing query which give us a deeper understanding of the text. We show how our model performs in a real-world enterprise use case. Word2Vec embeddings are used for word representations and later we compare three algorithms for classification. SVM’s provide us with a strong baseline. Two deep learning models viz. vanilla CNN and RNN’s with LSTM are compared. With no use of hard-coded or hand engineered features, our generic model can be used in a variety of use cases where text mining is involved with ease.

关键词

Voice of Customer; Text Classification; Deep Learning;

全文

PDF


DarkEmbed: Exploit Prediction With Neural Language Models

摘要

Software vulnerabilities can expose computer systems to attacks by malicious actors. With the number of vulnerabilities discovered in the recent years surging, creating timely patches for every vulnerability is not always feasible. At the same time, not every vulnerability will be exploited by attackers; hence, prioritizing vulnerabilities by assessing the likelihood they will be exploited has become an important research problem. Recent works used machine learning techniques to predict exploited vulnerabilities by analyzing discussions about vulnerabilities on social media. These methods relied on traditional text processing techniques, which represent statistical features of words, but fail to capture their context. To address this challenge, we propose DarkEmbed, a neural language modeling approach that learns low dimensional distributed representations, i.e., embeddings, of darkweb/deepweb discussions to predict whether vulnerabilities will be exploited. By capturing linguistic regularities of human language, such as syntactic, semantic similarity and logic analogy, the learned embeddings are better able to classify discussions about exploited vulnerabilities than traditional text analysis methods. Evaluations demonstrate the efficacy of learned embeddings on both structured text (such as security blog posts) and unstructured text (darkweb/deepweb posts). DarkEmbed outperforms state-of-the-art approaches on the exploit prediction task with an F1-score of 0.74.

关键词

Cyber Security; Exploit Prediction; Machine Learning; Neural Language Models

全文

PDF


Gesture Annotation With a Visual Search Engine for Multimodal Communication Research

摘要

Human communication is multimodal and includes elements such as gesture and facial expression along with spoken language. Modern technology makes it feasible to capture all such aspects of communication in natural settings. As a result, similar to fields such as genetics, astronomy and neuroscience, scholars in areas such as linguistics and communication studies are on the verge of a data-driven revolution in their fields. These new approaches require analytical support from machine learning and artificial intelligence to develop tools to help process the vast data repositories. The Distributed Little Red Hen Lab project is an international team of interdisciplinary researchers building a large-scale infrastructure for data-driven multimodal communications research. In this paper, we describe a machine learning system developed to automatically annotate a large database of television program videos as part of this project. The annotations mark regions where people or speakers are on screen along with body part motions including head, hand and shoulder motion. We also annotate a specific class of gestures known as timeline gestures. An existing gesture annotation tool, ELAN, can be used with these annotations to quickly locate gestures of interest. Finally, we provide an update mechanism for the system based on human feedback. We empirically evaluate the accuracy of the system as well as present data from pilot human studies to show its effectiveness at aiding gesture scholars in their work.

关键词

multimodal communication; gesture recognition; computer vision

全文

PDF


Mars Target Encyclopedia: Rock and Soil Composition Extracted From the Literature

摘要

We have constructed an information extraction system called the Mars Target Encyclopedia that takes in planetary science publications and extracts scientific knowledge about target compositions. The extracted knowledge is stored in a searchable database that can greatly accelerate the ability of scientists to compare new discoveries with what is already known. To date, we have applied this system to ~6000 documents and achieved 41-56% precision in the extracted information.

关键词

machine learning; information extraction; planetary science

全文

PDF


Deep Mars: CNN Classification of Mars Imagery for the PDS Imaging Atlas

摘要

NASA has acquired more than 22 million images from the planet Mars. To help users find images of interest, we developed a content-based search capability for Mars rover surface images and Mars orbital images. We started with the AlexNet convolutional neural network, which was trained on Earth images, and used transfer learning to adapt the network for use with Mars images. We report on our deployment of these classifiers within the PDS Imaging Atlas, a publicly accessible web interface, to enable the first content-based image search for NASA’s Mars images.

关键词

machine learning; image classification; planetary science

全文

PDF


Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce

摘要

Classifying products precisely and efficiently is a major challenge in modern e-commerce. The high traffic of new products uploaded daily and the dynamic nature of the categories raise the need for machine learning models that can reduce the cost and time of human editors. In this paper, we propose a decision level fusion approach for multi-modal product classification based on text and image neural network classifiers. We train input specific state-of-the-art deep neural networks for each input source, show the potential of forging them together into a multi-modal architecture and train a novel policy network that learns to choose between them. Finally, we demonstrate that our multi-modal network improves classification accuracy over both networks on a real-world large-scale product classification dataset that we collected from Walmart.com. While we focus on image-text fusion that characterizes e-commerce businesses, our algorithms can be easily applied to other modalities such as audio, video, physical sensors, etc.

关键词

Multi Modality; E-Commerce; Deep Learning

全文

PDF


Batting Order Setup in One Day International Cricket

摘要

In the professional sport of cricket, batting order assignment is of significant interest and importance to coaches, players, and fans as an influencing parameter on the game outcome. The impact of batting order on scoring runs is widely known and managers are often judged based on their perceived weakness or strength in setting the batting order. In practice, a combination of experts’ intuitions plus a few descriptive and sometimes conflicting performance statistics are used to assign an order to the batters in a team line-up before the games and in player replacement due to injuries during the games. In this paper, we propose the use of learning methods in automatic line-up order assignment based on several measures of performance and historical data. We discuss the importance of this problem in designing a winning strategy for cricket teams and the challenges this application introduces to the community and the currently existing approaches in AI.

关键词

cricket analytics; performance evaluation; optimization

全文

PDF


AI Challenges in Synthetic Biology Engineering

摘要

A wide variety of Artificial Intelligence (AI) techniques, from expert systems to machine learning to robotics, are needed in the field of synthetic biology. This paper describes the design-build-test engineering cycle and lists some challenges in which AI can help.

关键词

Synthetic Biology; Challenges; knowledge-based systems; knowledge representation; semantic networks; frame representations; machine learning; hypothesis generation; expert systems; constraint-based reasoning; planning under uncertainty; robotics

全文

PDF


Data Analysis Competition Platform for Educational Purposes: Lessons Learned and Future Challenges

摘要

Data analysis education plays an important role in accelerating the efficient use of data analysis technologies in various domains. Not only the knowledge of statistics and machine learning, but also practical skills of deploying machine learning and data analysis techniques, are required for conducting data analysis projects in the real world. Data analysis competitions, such as Kaggle, have been considered as an efficient system for learning such skills by addressing real data analysis problems. However, current data analysis competitions are not designed for educational purposes and it is not well studied how data analysis competition platforms should be designed for enhancing educational effectiveness. To answer this research question, we built, and subsequently operated an educational data analysis competition platform called University of Big Data for several years. In this paper, we present our approaches for supporting and motivating learners and the results of our case studies. We found that providing a tutorial article is beneficial for encouraging active participation of learners, and a leaderboard system allowing an unlimited number of submissions can motivate the efforts of learners. We further discuss future directions of educational data analysis competitions.

关键词

全文

PDF


Gesturing and Embodiment in Teaching: Investigating the Nonverbal ‎Behavior of Teachers in a Virtual Rehearsal Environment ‎

摘要

Interactive training environments typically include feedback mechanisms designed to help trainees improve their performance through either guided or self-reflection. In this context, trainees are candidate teachers who need to hone their social skills as well as other pedagogical skills for their future classroom. We chose an avatar-mediated interactive virtual training system–TeachLivE–as the basic research environment to investigate the motions and embodiment of the trainees. Using tracking sensors, and customized improvements for existing gesture recognition utilities, we created a gesture database and employed it for the implementation of our real-time gesture recognition and feedback application. We also investigated multiple methods of feedback provision, including visual and haptics. The results from the conducted user studies and user evaluation surveys indicate the positive impact of the proposed feedback applications and informed body language. In this paper, we describe the context in which the utilities have been developed, the importance of recognizing nonverbal communication in the teaching context, the means of providing automated feedback associated with nonverbal messaging, and the preliminary studies developed to inform the research.

关键词

Nonverbal Behavior; Embomiment; Education and Training; TeachLivE; Microsoft Kinect

全文

PDF


Introducing Ethical Thinking About Autonomous Vehicles Into an AI Course

摘要

A computer science faculty member and a philosophy faculty member collaborated in the development of a one-week introduction to ethics which was integrated into a traditional AI course. The goals were to: (1) encourage students to think about the moral complexities involved in developing accident algorithms for autonomous vehicles, (2) identify what issues need to be addressed in order to develop a satisfactory solution to the moral issues surrounding these algorithms, and (3) and to offer students an example of how computer scientists and ethicists must work together to solve a complex technical and moral problems. The course module introduced Utilitarianism and engaged students in considering the classic "Trolley Problem," which has gained contemporary relevance with the emergence of autonomous vehicles. Students used this introduction to ethics in thinking through the implications of their final projects. Results from the module indicate that students gained some fluency with Utilitarianism, including a strong understanding of the Trolley Problem. This short paper argues for the need of providing students with instruction in ethics in AI course. Given the strong alignment between AI's decision-theoretic approaches and Utilitarianism, we highlight the difficulty of encouraging AI students to challenge these assumptions.

关键词

artificial intelligence; teaching; ethics; autonomous vehicles; trolley problem; utilitarianism

全文

PDF


Dropout Model Evaluation in MOOCs

摘要

The field of learning analytics needs to adopt a more rigorous approach for predictive model evaluation that matches the complex practice of model-building. In this work, we present a procedure to statistically test hypotheses about model performance which goes beyond the state-of-the-practice in the community to analyze both algorithms and feature extraction methods from raw data. We apply this method to a series of algorithms and feature sets derived from a large sample of Massive Open Online Courses (MOOCs). While a complete comparison of all potential modeling approaches is beyond the scope of this paper, we show that this approach reveals a large gap in dropout prediction performance between forum-, assignment-, and clickstream-based feature extraction methods, where the latter is significantly better than the former two, which are in turn indistinguishable from one another. This work has methodological implications for evaluating predictive or AI-based models of student success, and practical implications for the design and targeting of at-risk student models and interventions.

关键词

MOOCs; Model Evaluation; Applied Statistics; Predictive Modeling

全文

PDF


Investigating Active Learning for Concept Prerequisite Learning

摘要

Concept prerequisite learning focuses on machine learning methods for measuring the prerequisite relation among concepts. With the importance of prerequisites for education, it has recently become a promising research direction. A major obstacle to extracting prerequisites at scale is the lack of large-scale labels which will enable effective data-driven solutions. We investigate the applicability of active learning to concept prerequisite learning.We propose a novel set of features tailored for prerequisite classification and compare the effectiveness of four widely used query strategies. Experimental results for domains including data mining, geometry, physics, and precalculus show that active learning can be used to reduce the amount of training data required. Given the proposed features, the query-by-committee strategy outperforms other compared query strategies.

关键词

concept prerequisite learning; active learning

全文

PDF


Diagnosing University Student Subject Proficiency and Predicting Degree Completion in Vector Space

摘要

We investigate the issues of undergraduate on-time graduation with respect to subject proficiencies through the lens of representation learning, training a student vector embeddings from a dataset of 8 years of course enrollments. We compare the per-semester student representations of a cohort of undergraduate Integrative Biology majors to those of graduated students in subject areas involved in their degree requirements. The result is an embedding rich in information about the relationships between majors and pathways taken by students which encoded enough information to improve prediction accuracy of on-time graduation to 95%, up from a baseline of 87.3%. Challenges to preparation of the data for student vectorization and sourcing of validation sets for optimization are discussed.

关键词

Representation learning; Learning Analytics; Higher-education

全文

PDF


An E-Learning Recommender That Helps Learners Find the Right Materials

摘要

Learning materials are increasingly available on the Web making them an excellent source of information for building e-Learning recommendation systems. However, learners often have difficulty finding the right materials to support their learning goals because they lack sufficient domain knowledge to craft effective queries that convey what they wish to learn. The unfamiliar vocabulary often used by domain experts creates a semantic gap between learners and experts, and also makes it difficult to map a learner's query to relevant learning materials. We build an e-Learning recommender system that uses background knowledge extracted from a collection of teaching materials and encyclopedia sources to support the refinement of learners' queries. Our approach allows us to bridge the gap between learners and teaching experts. We evaluate our method using a collection of realistic learner queries and a dataset of Machine Learning and Data Mining documents. Evaluation results show our method to outperform benchmark approaches and demonstrates its effectiveness in assisting learners to find the right materials.

关键词

e-Learning; Recommender Systems; Background Knowledge; Query Refinement

全文

PDF


Predictive Modeling of Learning Continuation in Preschool Education Using Temporal Patterns of Development Tests

摘要

Learning analytics applies data analysis techniques to learning data in order to support students’ learning processes and to improve the quality of education. Despite the increasing attention to learning analytics for higher education, it has not been fully addressed in primary and preschool education. In this research, we apply learning analytics to preschool education to predict the continuation of learning of preschool children. Based on our hypothesis that temporal patterns in the assessment scores of development tests are effective features for prediction, we extract the temporal patterns using time-series clustering, and use them as the features of prediction models. The experimental results using a real preschool education dataset show that the use of the temporal patterns improves the predictive accuracy of future continuation of study.

关键词

全文

PDF


On the Importance of a Research Data Archive

摘要

As research becomes more and more data intensive, managing this data becomes a major challenge in any organization. At university level there is seldom a unified data management system in place. The general approach to storing data in such environments is to deploy network storage. Each member can store their data organized to their own likings in their dedicated location on the network. Additionally, users tend to store data in distributed manner such as on private devices, portable storage, or public and private repositories. Adding to this complexity, it is common for university departments to have high fluctuation of staff, resulting in major loss of information and data on an employee’s departure. A common scenario then is that it is known that certain data has already been created via experiments or simulation. However, it can not be retrieved, resulting in a repetition of generation, which is costly and time-consuming. Additionally, as of recent years, publishers and funding agencies insist on storing, sharing, and reusing existing research data. We show how digital preservation can help group leaders and their employees cope with these issues, by introducing our own archival system OntoRAIS.

关键词

Digital Preservation

全文

PDF


Addressing the Technical, Philosophical, and Ethical Issues of Artificial Intelligence Through Active Learning Class Assignments

摘要

Artificial intelligence (AI) is an extremely large and complex field technically, while at the same time it captures our imagination and prompts us to explore major philosophical and ethical questions concerning humanity and human intelligence. Teaching a course that does justice to all these aspects of the field is a big challenge. However, due to the increase in computational capability with a commensurate decrease in cost, a wealth of products and materials are available that can be used to provide students with rich, meaningful, and memorable experiences within the context of a primarily technical course in AI. Toys, articles, and movies can all be used to foster student exploration of key questions in the technical, philosophical, and ethical issues of AI.

关键词

AI courseware; multimedia; teaching AI; philosophical issues; ethical issues

全文

PDF


Introducing Machine Learning Concepts by Training a Neural Network to Recognize Hand Gestures

摘要

We present an interactive guided activity to introduce supervised learning by training a deep neural network (treated as a black box) to recognize "rock paper scissors" hand gestures from unconstrained images. The audience is actively involved in acquiring a varied and representative dataset, on which the rest of the activity is based. Covered concepts include the training/evaluation split, classifier evaluation, baseline accuracy, overfitting, generalization, data augmentation.

关键词

machine learning; computer vision; education; convolutional neural networks; deep learning

全文

PDF


Mighty Thymio for University-Level Educational Robotics

摘要

Thymio is a small, inexpensive, mass-produced mobile robot with widespread use in primary and secondary education. In order to make it more versatile and effectively use it in later educational stages, including university levels, we have expanded Thymio's capabilities by adding off-the-shelf hardware and open software components. The resulting robot, that we call Mighty Thymio, provides additional sensing functionalities, increased computing power, networking, and full ROS integration. We present the architecture of Mighty Thymio and show its application in advanced educational activities.

关键词

robotics; higher education

全文

PDF


A Driving License for Intelligent Systems

摘要

Artificial Intelligence (AI) is becoming increasingly important. Thus, sound knowledge about the principles of AI will be a crucial factor for future careers of young people as well as for the development of novel, innovative products. Addressing this challenge, we present an ambitious 3-year project focusing on developing and implementing a professional, internationally accepted, standardized training and certification system for AI which will also be recognized by the industry and educational institutions. The approach is based on already implemented and evaluated pilot projects in the area of AI education. The project’s main goal is to train and certify teachers and mentors as well as students and young people in basic and advanced AI topics, fostering AI literacy among this target audience.

关键词

全文

PDF


Introducing AI to Undergraduate Students via Computer Vision Projects

摘要

Computer vision, as a subfield in the general artificial intelligence (AI), is a technology can be visualized and easily found in a large number of state-of-art applications. In this project, undergraduate students performed research on a landmark recognition task using computer vision techniques. The project focused on analyzing, designing, configuring, and testing the two core components in landmark recognition: feature detection and description. The project modeled the landmark recognition system as a tour guide for visitors to the campus and evaluated the performance in the real world circumstances. By analyzing real-world data and solving problems, student’s cognitive skills and critical thinking skills were sharpened. Their knowledge and understanding in mathematical modeling and data processing were also enhanced.

关键词

Landmark recognition; Feature detection; Feature description; SURF

全文

PDF


Model AI Assignments 2018

摘要

The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning ex- perience, we here present abstracts of seven AI assign- ments from the 2018 session that are easily adoptable, playfully engaging, and flexible for a variety of instruc- tor needs.

关键词

全文

PDF


Clustering - What Both Theoreticians and Practitioners Are Doing Wrong

摘要

Unsupervised learning is widely recognized as one of the most important challenges facing machine learning nowadays. However, in spite of hundreds of papers on the topic being published every year, current theoretical understanding and practical implementations of such tasks, in particular of clustering, is very rudimentary. This note focuses on clustering. The first challenge I address is model selection---how should a user pick an appropriate clustering tool for a given clustering problem, and how should the parameters of such an algorithmic tool be tuned? In contrast with other common computational tasks, for clustering, different algorithms often yield drastically different outcomes. Therefore, the choice of a clustering algorithm may play a crucial role in the usefulness of an output clustering solution. However, currently there exists no methodical guidance for clustering tool selection for a given clustering task. I argue the severity of this problem and describe some recent proposals aiming to address this crucial lacuna.

关键词

clustering, theory, practice, bias, challenges

全文

PDF


Learning Constraints From Examples

摘要

While constraints are ubiquitous in artificial intelligence and constraints are also commonly used in machine learning and data mining, the problem of learning constraints from examples has received less attention. In this paper, we discuss the problem of constraint learning in detail, indicate some subtle differences with standard machine learning problems, sketch some applications and summarize the state-of-the-art.

关键词

Machine Learning; Constraint Satisfaction; Constraint Programming; Constraint Learning; Structured Output Prediction

全文

PDF


Computational Social Choice and Computational Complexity: BFFs?

摘要

We discuss the connection between computational social choice (comsoc) and computational complexity. We stress the work so far on, and urge continued focus on, two less-recognized aspects of this connection. Firstly, this is very much a two-way street: Everyone knows complexity classification is used in comsoc, but we also highlight benefits to complexity that have arisen from its use in comsoc. Secondly, more subtle, less-known complexity tools often can be very productively used in comsoc.

关键词

computational social choice; computational complexity; classification; search versus decision

全文

PDF


AI Meets Chemistry

摘要

We argue that chemistry should be the next grand challenge for Artificial Intelligence. The AI research community and humanity would benefit tremendously from focusing AI research on chemistry on a regular basis, as a benchmark as well as a real-world application domain. To support our position, we review the importance of chemical compound discovery and synthesis planning and discuss the properties of search spaces in a chemistry problem. Knowledge acquired in domains such as two-player board games or single-player puzzles places the AI community in a good position to solve critical problems in the chemistry domain. Yet, we show that searching in chemistry problems poses significant additional challenges that will have to be addressed. Finally, we envision how several AI areas like Natural Language Processing, Machine Learning, planning and search, are relevant for chemistry.

关键词

Machine learning; NLP; planning; search

全文

PDF


Learning Fast and Slow: Levels of Learning in General Autonomous Intelligent Agents

摘要

We propose two distinct levels of learning for general autonomous intelligent agents. Level 1 consists of fixed architectural learning mechanisms that are innate and automatic. Level 2 consists of deliberate learning strategies that are controlled by the agent's knowledge. We describe these levels and provide an example of their use in a task-learning agent. We also explore other potential levels and discuss the implications of this view of learning for the design of autonomous agents.

关键词

Learning; Cognitive Architecture; Agent Learning; Autonomous Agents; Human Learning

全文

PDF


Imagination Machines: A New Challenge for Artificial Intelligence

摘要

The aim of this paper is to propose a new overarching challenge for AI: the design of imagination machines. Imagination has been defined as the capacity to mentally transcend time, place, and/or circumstance. Much of the success of AI currently comes from a revolution in data science, specifically the use of deep learning neural networks to extract structure from data. This paper argues for the development of a new field called imagination science, which extends data science beyond its current realm of learning probability distributions from samples. Numerous examples are given in the paper to illustrate that human achievements in the arts, literature, poetry, and science may lie beyond the realm of data science, because they require abilities that go beyond finding correlations: for example, generating samples from a novel probability distribution different from the one given during training; causal reasoning to uncover interpretable explanations; or analogical reasoning to generalize to novel situations (e.g., imagination in art, representing alien life in a distant galaxy, understanding a story about talking animals, or inventing representations to model the large-scale structure of the universe). We describe the key challenges in automating imagination, discuss connections between ongoing research and imagination, and outline why automation of imagination provides a powerful launching pad for transforming AI.

关键词

Data science; machine learning; imagination

全文

PDF


Engineering Pro-Sociality With Autonomous Agents

摘要

This paper envisions a future where autonomous agents are used to foster and support pro-social behavior in a hybrid society of humans and machines. Pro-social behavior occurs when people and agents perform costly actions that benefit others. Acts such as helping others voluntarily, donating to charity, providing informations or sharing resources, are all forms of pro-social behavior. We discuss two questions that challenge a purely utilitarian view of human decision making and contextualize its role in hybrid societies: i) What are the conditions and mechanisms that lead societies of agents and humans to be more pro-social? ii) How can we engineer autonomous entities (agents and robots) that lead to more altruistic and cooperative behaviors in a hybrid society? We propose using social simulations, game theory, population dynamics, and studies with people in virtual or real environments (with robots) where both agents and humans interact. This research will constitute the basis for establishing the foundations for the new field of Pro-social Computing, aiming at understanding, predicting and promoting pro-sociality among humans, through artificial agents and multiagent systems.

关键词

Pro-social Agents

全文

PDF


A Brief History and Recent Achievements in Bidirectional Search

摘要

The state of the art in bidirectional search has changed significantly a very short time period; we now can answer questions about unidirectional and bidirectional search that until very recently we were unable to answer. This paper is designed to provide an accessible overview of the recent research in bidirectional search in the context of the broader efforts over the last 50 years. We give particular attention to new theoretical results and the algorithms they inspire for optimal and near-optimal node expansions when finding a shortest path.

关键词

bidirectional search; heuristic search

全文

PDF


FgER: Fine-Grained Entity Recognition

摘要

Fine-grained Entity Recognition (FgER) is the task of detecting and classifying entity mentions into more than 100 types. The type set can span various domains including biomedical (e.g., disease, gene), sport (e.g., sports event, sports player), religion and mythology (e.g., religion, god) and entertainment (e.g., movies, music). Most of the existing literature for Entity Recognition (ER) focuses on coarse-grained entity recognition (CgER), i.e., recognition of entities belonging to few types such as person, location and organization. In the past two decades, several manually annotated datasets spanning different genre of texts were created to facilitate the development and evaluation of CgER systems (Nadeau and Sekine 2007). The state-of-the-art CgER systems use supervised statistical learning models trained on manually annotated datasets (Ma and Hovy 2016). In contrast, FgER systems are yet to match the performance level of CgER systems. There are two major challenges associated with failure of FgER systems. First, manually annotating a large-scale multi-genre training data for FgER task is expensive, time-consuming and error-prone. Note that, a human-annotator will have to choose a subset of types from a large set of types and types for the same entity might differ in sentences based on the contextual information. Second, supervised statistical learning models when trained on automatically generated noisy training data fits to noise, impacting the model’s performance. The objective of my thesis is to create a FgER system by exploring an off the beaten path which can eliminate the need for manually annotating large-scale multi-genre training dataset. The path includes: (1) automatically generating a large-scale single-genre training dataset, (2) noise-aware learning models that learn better in noisy datasets, and (3) use of knowledge transfer approaches to adapt FgER system to different genres of text.

关键词

entity recognition; learning with noise; transfer learning

全文

PDF


Abstraction Sampling in Graphical Models

摘要

We present a new sampling scheme for approximating hard to compute queries over graphical models, such as computing the partition function. The scheme builds upon exact algorithms that traverse a weighted directed state-space graph representing a global function over a graphical model (e.g., probability distribution). With the aid of an abstraction function and randomization, the state space can be compacted (trimmed) to facilitate tractable computation, yielding a Monte Carlo estimate that is unbiased. We present the general idea and analyze its properties analytically and empirically.

关键词

graphical models; partition function; importance sampling; heuristic search

全文

PDF


Spatio-Temporal Model for Wildlife Poaching Prediction Evaluated Through a Controlled Field Test in Uganda

摘要

Worldwide, conservation agencies employ rangers to protect conservation areas from poachers. However, agencies lack the manpower to have rangers effectively patrol these vast areas frequently. While past work has modeled poachers’ behavior so as to aid rangers in planning future patrols, those models’ predictions were not validated by extensive field tests. In my thesis, I present a spatio-temporal model that predicts poaching threat levels and results from a five-month field test in Uganda’s Queen Elizabeth Protected Area (QEPA). To my knowledge, this is the first time that a predictive model has been evaluated through such an extensive field test in this domain. These field test will be extended to another park in Uganda, Murchison Fall Protected Area, shortly. Main goals of my thesis are to develop the best performing model in terms of speed and accuracy and use such model to generate efficient and feasible patrol routes for the park rangers.

关键词

Spatio-temporal predictive models, poaching forecasting, wildlife conservation

全文

PDF


Reasonableness Monitors

摘要

As we move towards autonomous machines responsible for making decisions previously entrusted to humans, there is an immediate need for machines to be able to explain their behavior and defend the reasonableness of their actions. To implement this vision, each part of a machine should be aware of the behavior of the other parts that they cooperate with. Each part must be able to explain the observed behavior of those neighbors in the context of the shared goal for the local community. If such an explanation cannot be made, it is evidence that either a part has failed (or was subverted) or the communication has failed. The development of reasonableness monitors is work towards generalizing that vision, with the intention of developing a system-construction methodology that enhances both robustness and security, at runtime (not static compile time), by dynamic checking and explaining of the behaviors of parts and subsystems for reasonableness in context.

关键词

Cognitive Systems; Symbolic AI;Common-Sense Reasoning

全文

PDF


Decomposition-Based Solving Approaches for Stochastic Constraint Optimisation

摘要

Combinatorial optimisation problems often contain uncertainty that has to be taken into account to produce realistic solutions. A common way to describe the uncertainty is by means of scenarios, where each scenario describes different potential sets of problem parameters based on random distributions or historical data. While efficient algorithmic techniques exist for specific problem classes such as linear programs, there are very few approaches that can handle general Constraint Programming formulations subject to uncertainty. The goal of my PhD is to develop generic methods for solving stochastic combinatorial optimisation problems formulated in a Constraint Programming framework.

关键词

Stochastic Constraint Programming, Constraint Satisfaction

全文

PDF


Probabilistic Planning With Influence Diagrams

摘要

Graphical models provide a powerful framework for reasoning under uncertainty, and an influence diagram (ID) is a graphical model of a sequential decision problem that maximizes the total expected utility of a non-forgetting agent. Relaxing the regular modeling assumptions, an ID can be flexibly extended to general decision scenarios involving a limited memory agent or multi-agents. The approach of probabilistic planning with IDs is expected to gain computational leverage by exploiting the local structure as well as representation flexibility of influence diagram frameworks. My research focuses on graphical model inference for IDs and its application to probabilistic planning, targeting online MDP/POMDP planning as testbeds in the evaluation.

关键词

influence diagram; probabilistic planning; online planning

全文

PDF


Guaranteed Plans for Multi-Robot Systems via Optimization Modulo Theories

摘要

Industries are on the brink of widely accepting a new paradigm for organizing production by having autonomous robots manage in-factory processes. This transition from static process chains towards more automation and autonomy poses new challenges in terms of, e.g., efficiency of production processes. The RoboCup Logistics League (RCLL) has been proposed as a realistic testbed to study the above mentioned problem at a manageable scale. In RCLL, teams of robots manage and optimize the material flow according to dynamic orders in a simplified factory environment. In particular, robots have to transport workpieces among several machines scattered around the factory shop floor. Each machine performs a specific processing step, orders that denote the products which must be assembled with these operations are posted at run-time and require quick planning and scheduling. Orders also come with a delivery time window, therefore introducing a temporal component into the problem. Though there exist successful heuristic approaches to solve the underlying planning and scheduling problems, a disadvantage of these methods is that they provide no guarantees about the quality of the solution. A promising solution to this problem is offered by the recently emerging field of Optimization Modulo Theories (OMT), where Satisfiability Modulo Theories (SMT) solving is extended with optimization functionalities. In this paper, we present an approach that combines bounded model checking and optimization to generate optimal controllers for multi-robot systems. In particular, using the RoboCup Logistics League as a testbed, we build formal models for robot motions, production processes, and for order schedules, deadlines and rewards. We then encode the synthesis problem as a linear mixed-integer problem and employ Optimization Modulo Theories to synthesize controllers with optimality guarantees.

关键词

OMT, robotics, planning

全文

PDF


Sequential Decision Making in Artificial Musical Intelligence

摘要

My main research motivation is to develop complete autonomous agents that interact with people socially. For an agent to be social with respect to humans, it needs to be able to parse and process the multitude of aspects that comprise the human cultural experience. That in itself gives rise to many fascinating learning problems. I am interested in tackling these fundamental problems from an empirical as well as a theoretical perspective. Music, as a general target domain, serves as an excellent testbed for these research ideas. Musical skills---playing music (alone or in a group), analyzing music or composing it---all involve extremely advanced knowledge representation and problem solving tools. Creating "musical agents"---agents that can interact richly with people in the music domain---is a challenge that holds the potential of advancing social agents research, and contributing important and broadly applicable AI knowledge. This belief is fueled not just by my background in computer science and artificial intelligence, but also by my deep passion for music as well as my extensive musical training. One key aspect of musical intelligence which hasn’t been sufficiently studied is that of sequential decision-making. My thesis strives to answer the following question: How can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and multiagent interaction in the context of music.

关键词

reinforcement learning; computational musicology; music agents

全文

PDF


Adaptive and Dynamic Team Formation for Strategic and Tactical Planning

摘要

Past work in security games has mainly focused on the problem static resource allocation; how to optimally deploy a given fixed team of resources. My research aims to address the challenge of integrating operational planning into security games, where resources are heterogeneous and the defender is tasked with optimizing over both the investment into these resources, as well as their deployment in the field. This allows the defender to design more adaptive strategies, reason about the efficiency of their use of these resources as well as their effectiveness in their deployment. This thesis explores the challenges in integrating these two optimization problems in both the single stage and multi-stage setting and provides a formal model of this problem, which we refer to as the Simultaneous Optimization of Resource Teams and Tactics (SORT) as a new fundamental research problem in security games that combines strategic and tactical decision making. The main contributions of this work are solution methods to the SORT problem under various settings as well as exploring various types of tradeoffs that can arise in these settings. These include managing budget for investment in resources as well as capacity constraints on use of resources. My work addresses scenarios when the tactical decision problem (optimal deployment) is difficult, and thus evaluating the performance of any given team is difficult. Additionally, I address domains where we are tasked with making repeated strategic level decision and where, due to changing domain features, fluctuations in time dependent processes or the realization of uncertain parameters in the problem, it becomes necessary to re-evaluate and adapt to new information.

关键词

Game Theory; Planning;

全文

PDF


Complexity of Optimally Defending and Attacking a Network

摘要

We consider single-agent, two-agent turn-taking and cooperative security games with Inverse Geodesic Length (IGL) as utility metric. We focus on the single-agent vertex deletion problem corresponding to IGL called MinIGL. Specifically, given a graph G, a budget k and a target inverse geodesic length T, does there exist a subset of vertices S of size k such that by deleting S the graph induced on the remaining vertices in G has IGL at most T. We cite our recently published work to report the results on the computational and parameterized complexity of MinIGL. Furthermore, we briefly state the problems we are interested to study in future.

关键词

Network Analysis; Network Security; Network Vulnerability; Fixed Parameter Tractability; Security Games

全文

PDF


Constraint Satisfaction Techniques for Combinatorial Problems

摘要

The last two decades have seen extraordinary advances in industrial applications of constraint satisfaction techniques, while combinatorial problems have been pushed to the sidelines. We propose a comprehensive analysis of the state of the art in constraint satisfaction problems when applied to combinatorial problems in areas such as graph theory, set theory, algebra, among others. We believe such a study will provide us with a deeper understanding about the limitations we still face in constraint satisfaction problems.

关键词

CSP; SAT; AllSAT

全文

PDF


Reading With Robots: Towards a Human-Robot Book Discussion System for Elderly Adults

摘要

As people age, it is critical that they maintain not only their physical health, but also their cognitive health―for instance, by engaging in cognitive exercise. Recent advancements in AI have uncovered novel ways through which to facilitate such exercise. In this thesis, I propose the first human-robot dialogue system designed specifically to promote cognitive exercise in elderly adults, through discussions about interesting metaphors in books. I describe my work to date, including the development of a new, large corpus and an approach for automatically scoring metaphor novelty. Finally, I outline my plans for incorporating this work into the proposed system.

关键词

metaphor; metaphor novelty; figurative language; natural language processing; human-robot interaction; dialogue systems

全文

PDF


Cross-Lingual Learning With Distributed Representations

摘要

Cross-lingual Learning can help to bring state-of-the-art deep learning solutions to smaller languages. These languages in general lack resource for training advanced neural networks. With transfer of knowledge across languages we can improve the results for various NLP tasks.

关键词

crosslingual learning; multilingual learning; transfer learning; distributed representations

全文

PDF


Game-Theoretic Threat Screening and Deceptive Techniques for Cyber Defense

摘要

My research addresses the problem faced by a defender who must screen objects for potential threats that are coming into a secure area. The particular domain of interest for my work is the protection of cyber networks from intrusions given the presence of a strategic adversary. My thesis work allows fora defender to use game-theoretical methods that randomize her protection strategy and introduces uncertainty to the adversary that makes it more difficult to attack the defender’s network successfully.

关键词

Security Games; Threat Screening

全文

PDF


Hierarchical Methods for a Unified Approach to Discourse, Domain, and Style in Neural Conversational Models

摘要

With the advent of personal assistants such as Siri and Alexa, there has been a renewed focus on dialog systems, specifically open domain conversational agents. Dialog is a challenging problem since it spans multiple conversational turns. To further complicate the problem, there are many contextual cues and valid possible utterances. Dialog is fundamentally a multiscale process given that context is carried from previous utterances in the conversation; however, current neural methods lack the ability to carry human-like conversation. Neural dialog models are based on recurrent neural network Encoder-Decoder sequence-to-sequence models (Sutskever, Vinyals, and Le, 2014; Bahdanau, Cho, and Bengio, 2015). However, these models lack the ability to create temporal and stylistic coherence in conversations. We propose to incorporate dialog acts (such as Statement-non-opinion ["Me, I'm in the legal department."], Acknowledge ["Uh-huh."]) and discourse connectives (e.g. "because," "then"), utterance clustering and domain prediction, and style shifting using hierarchical methods. In particular, we show that clustering of utterance representations automatically allows for a unified hierarchical approach to discourse, domain, and style.

关键词

全文

PDF


Efficiency and Safety in Autonomous Vehicles Through Planning With Uncertainty

摘要

Autonomous vehicles are quickly becoming an important part of human society for transportation, monitoring, agriculture, and other applications. In these applications, there is a fundamental tradeoff between safety and efficiency that is especially salient when the autonomous vehicles interact directly with humans. A key to maintaining safety without sacrificing efficiency is dealing with uncertainty properly so that robots can be assertive when it is appropriate and careful in dangerous situations. The research that will be presented in my thesis uses the partially observable Markov decision process framework to approach this challenge, exploring several applications and proposing a new solution approach that is able to handle continuous action and observation spaces, a qualitative improvement over current methods.

关键词

Sequential Decision Making; Uncertainty in AI; Localization, Mapping, and Navigation; Human-Robot Interaction; Motion and Path Planning

全文

PDF


Identifying Private Content for Online Image Sharing

摘要

I present the outline of my dissertation work, Identifying Private Content for Online Image Sharing. Particularly, in my dissertation, I explore learning models to predict appropriate binary privacy settings (i.e., private, public) for images, before they are shared online. Specifically, I investigate textual features (user-annotated tags and automatically derived tags), and visual semantic features that are transferred from various layers of deep Convolutional Neural Network (CNN). Experimental results show that the learning models based on the proposed features outperform strong baseline models for this task on the Flickr dataset of thousands of images.

关键词

deep visual features; deep convolutional neural network; privacy setting prediction; social networking site; image privacy

全文

PDF


Enhancing Machine Learning Classification for Electrical Time Series Applications

摘要

Machine learning applications to electrical time series data will have wide-ranging impacts in the near future. Electricity disaggregation holds the promise of reducing billions of dollars of electrical waste every year. In the power grid, automatic classification of disturbance events detected by phasor measurement units could prevent cascading blackouts before they occur. Additional applications include better market segmentation by utility companies, improved design of appliances, and reliable incorporation of renewable energy resources into the power grid. However, existing machine learning methods remain unimplemented in the real world because of limiting assumptions that hinder performance. My research contributions are summarized as follows: In electricity disaggregation, I introduced the first label correction approach for supervised training samples. For unsupervised disaggregation, I introduced event detection that does not require parameter tuning and appliance discovery that makes no assumptions on appliance types. These improvements produce better accuracy, faster computation, and more scalability than any previously introduced method and can be to applied natural gas disaggregation, water disaggregation, and other source separation domains. My current work challenges long-held assumptions in time series shapelets, a classification tool with applicability in electrical time series and dozens of additional domains.

关键词

Supervised Learning; Unsupervised Learning; Time Series; Data Streams; Machine Learning Applications; Electricity Disaggregation; Phasor Measurement Units; Source Separation; Classification

全文

PDF


Building More Explainable Artificial Intelligence With Argumentation

摘要

Currently, much of machine learning is opaque, just like a "black box." However, in order for humans to understand, trust and effectively manage the emerging AI systems, an AI needs to be able to explain its decisions and conclusions. In this paper, I propose an argumentation-based approach to explainable AI, which has the potential to generate more comprehensive explanations than existing approaches.

关键词

Explainable AI; Argumentation; Explanation

全文

PDF


Plan-Based Intention Revision

摘要

Plan-based story generation has operationalized concepts from the Belief-Desire-Intention (BDI) theory of mind to create goal-driven character agents with explainable behavior. However, these character agents are limited in that they do not capture the dynamic nature of intentions. To address this limitation, we define a plan-based intention revision model and propose an evaluation using the QUEST cognitive model to assess the explainability of an intention revision.

关键词

intention;BDI agents;story planning

全文

PDF


Training Autoencoders in Sparse Domain

摘要

Autoencoders (AE) are essential in learning representation of large data (like images) for dimensionality reduction. Images are converted to sparse domain using transforms like Fast Fourier Transform (FFT) or Discrete Cosine Transform (DCT) where information that requires encoding is minimal. By optimally selecting the feature-rich frequencies, we are able to learn the latent vectors more robustly. We successfully show enhanced performance of autoencoders in sparse domain for images.

关键词

Autoencoders; Neural networks; Fast Fourier Transform; Discrete Cosine Transform

全文

PDF


Learning to Detect Pointing Gestures From Wearable IMUs

摘要

We propose a learning-based system for detecting when a user performs a pointing gesture, using data acquired from IMU sensors, by means of a 1D convolutional neural network. We quantitatively evaluate the resulting detection accuracy, and discuss an application to a human-robot interaction task where pointing gestures are used to guide a quadrotor landing.

关键词

全文

PDF


Proposition Entailment in Educational Applications Using Deep Neural Networks

摘要

To have a more meaningful impact, educational applications need to significantly improve the way feedback is offered to teachers and students. We propose two methods for determining propositional-level entailment relations between a reference answer and a student's response. Both methods, one using hand-crafted features and an SVM and the other using word embeddings and deep neural networks, achieve significant improvements over a state-of-the-art system and two alternative approaches.

关键词

educational applications; deep neural networks; machine learning; word embeddings; entailment

全文

PDF


Conditional Linear Regression

摘要

Previous work in machine learning and statistics commonly focuses on building models that capture the vast majority of data, possibly ignoring a segment of the population as outliers. By contrast, we may be interested in finding a segment of the population for which we can find a linear rule capable of achieving more accurate predictions. We give an efficient algorithm for the conditional linear regression task, which is the joint task of identifying a significant segment of the population, described by a k-DNF, along with its linear regression fit.

关键词

全文

PDF


FR-ANet: A Face Recognition Guided Facial Attribute Classification Network

摘要

In this paper, we study the problem of facial attribute learning. In particular, we propose a Face Recognition guided facial Attribute classification Network, called FR-ANet. All the attributes share low-level features, while high-level features are specially learned for attribute groups. Further, to utilize the identity information, high-level features are merged to perform face identity recognition. The experimental results on CelebA and LFWA datasets demonstrate the promise of the FR-ANet.

关键词

全文

PDF


A Stratified Feature Ranking Method for Supervised Feature Selection

摘要

Most feature selection methods usually select the highest rank features which may be highly correlated with each other. In this paper, we propose a Stratified Feature Ranking (SFR) method for supervised feature selection. In the new method, a Subspace Feature Clustering (SFC) is proposed to identify feature clusters, and a stratified feature ranking method is proposed to rank the features such that the high rank features are lowly correlated. Experimental results show the superiority of SFR.

关键词

全文

PDF


Selecting Proper Multi-Class SVM Training Methods

摘要

Support Vector Machines (SVMs) are excellent candidate solutions to solving multi-class problems, and multi-class SVMs can be trained by several different methods. Different training methods commonly produce SVMs with different effectiveness, and no multi-class SVM training method always outperforms other multi-class SVM training methods on all problems. This raises difficulty for practitioners to choose the best training method for a given problem. In this work, we propose a Multi-class Method Selection (MMS) approach to help users select the most appropriate method among one-versus-one (OVO), one-versus-all (OVA) and structural SVMs (SSVMs) for a given problem. Our key idea is to select the training method based on the distribution of training data and the similarity between different classes. Using the distribution and class similarity, we estimate the unclassifiable rate of each multi-class SVM training method, and select the training method with the minimum unclassifiable rate. Our initial findings show: (i) SSVMs with linear kernel perform worse than OVO and OVA; (ii) MMS often produces SVM classifiers that can confidently classify unseen instances.

关键词

Multi-class SVMs; one-versus-one; one-versus-all; structural SVMs

全文

PDF


Negative-Aware Influence Maximization on Social Networks

摘要

How to minimize the impact of negative users within the maximal set of influenced users? The Influenced Maximization (IM) is important for various applications. However, few studies consider the negative impact of some of the influenced users.We propose a negative-aware influence maximization problem by considering users' negative impact. A novel algorithm is proposed to solve the problem. Experiments on real-world datasets show the proposed algorithm can achieve 70% improvement on average in expected influence compared with rivals.

关键词

Influence Maximization; Social Network; Negative Users

全文

PDF


Visual Recognition in Very Low-Quality Settings: Delving Into the Power of Pre-Training

摘要

Visual recognition from very low-quality images is an extremely challenging task with great practical values. While deep networks have been extensively applied to low-quality image restoration and high-quality image recognition tasks respectively, few works have been done on the important problem of recognition from very low-quality images.This paper presents a degradation-robust pre-training approach on improving deep learning models towards this direction. Extensive experiments on different datasets validate the effectiveness of our proposed method.

关键词

neural network; deep learning; image recognition

全文

PDF


"Did I Say Something Wrong?": Towards a Safe Collaborative Chatbot

摘要

Chatbots have been a core measure of AI since Turing has presented his test for intelligence, and are also widely used for entertainment purposes. In this paper we present a platform that enables users to collaboratively teach a chatbot responses, using natural language. We present a method of collectively detecting malicious users and using the commands taught by these users to further mitigate activity of future malicious users.

关键词

Instructable agent;Personal assistant;Human-agent interaction

全文

PDF


Preliminary Results on Exploration-Driven Satisfiability Solving

摘要

In this abstract, we present our study of exploring the SAT search space via random-sampling, with the goal of improving Conflict Directed Clause Learning (CDCL) SAT solvers. Our proposed CDCL SAT solving algorithm expSAT uses a novel branching heuristic expVSIDS. It combines the standard VSIDS scores with heuristic scores derived from exploration. Experiments with application benchmarks from recent SAT competitions demonstrate the potential of the expSAT approach for improving CDCL SAT solvers.

关键词

Satisfiability Solving; SAT; Exploration; Random Walk; Monte-Carlo Tree Search; Reinforcement Learning

全文

PDF


Multi-Label Community-Based Question Classification via Personalized Sequence Memory Network Learning

摘要

Multi-label community-based question classification is a challenging problem in Community-based Question Answering (CQA) services, arising in many real applications such as question navigation and expert finding. Most of the existing approaches consider the problem as content-based tag suggestion task, which suffers from the textual sparsity issue. Unlike the previous studies, we consider the problem of multi-label community-based question classification from the viewpoint of personalized sequence learning. We introduce the personalized sequence memory network that leverages not only the semantics of questions but also the personalized information of askers to provide the sequence tag learning function to capture the high-order tag dependency. The experiment on real-world dataset shows the effectiveness of our method.

关键词

question tagging; memory network; multi-label

全文

PDF


Adversarial Goal Generation for Intrinsic Motivation

摘要

Generally in Reinforcement Learning the goal, or reward signal, is given by the environment and cannot be controlled by the agent. We propose to introduce an intrinsic motivation module that will select a reward function for the agent to learn to achieve. We will use a Universal Value Function Approximator, that takes as input both the state and the parameters of this reward function as the goal to predict the value function (or action-value function) to generalize across these goals. This module will be trained to generate goals such that the agent's learning is maximized. Thus, this is also a method for automatic curriculum learning.

关键词

Reinforcement Learning; Deep Learning; Curriculum Learning; Intrinsic Motivation

全文

PDF


Deep Modeling of Social Relations for Recommendation

摘要

Social-based recommender systems have been recently proposed by incorporating social relations of users to alleviate sparsity issue of user-to-item rating data and to improve recommendation performance. Many of these social-based recommender systems linearly combine the multiplication of social features between users. However, these methods lack the ability to capture complex and intrinsic non-linear features from social relations. In this paper, we present a deep neural network based model to learn non-linear features of each user from social relations, and to integrate into probabilistic matrix factorization for rating prediction problem. Experiments demonstrate the advantages of the proposed method over state-of-the-art social-based recommender systems.

关键词

Recommender Systems; Social Relations; Rating Prediction; Deep Learning

全文

PDF


Learning Feature Representations for Keyphrase Extraction

摘要

In supervised approaches for keyphrase extraction, a candidate phrase is encoded with a set of hand-crafted features and machine learning algorithms are trained to discriminate keyphrases from non-keyphrases. Although the manually-designed features have shown to work well in practice, feature engineering is a difficult process that requires expert knowledge and normally does not generalize well. In this paper, we present SurfKE, a feature learning framework that exploits the text itself to automatically discover patterns that keyphrases exhibit. Our model represents the document as a graph and automatically learns feature representation of phrases. The proposed model obtains remarkable improvements in performance over strong baselines.

关键词

keyphrase extraction; keyphrase embeddings; feature learning; graph representation

全文

PDF


A Framework for Evaluating Barriers to the Democratization of Artificial Intelligence

摘要

The "democratization" of AI has been taken up as a primary goal by several major tech companies. However, these efforts resemble earlier "freeware" and "open access" initiatives, and it is unclear how or whether they are informed by political conceptions of democratic governance. A political formulation of the democratization of AI is thus necessary. This paper presents a framework for the democratic governance of technology through intelligent trial and error (ITE) that can be utilized to evaluate barriers to the democratization of AI and suggest strategies for overcoming them.

关键词

artificial intelligence; democracy; governance; democratization of AI

全文

PDF


AdGAP: Advanced Global Average Pooling

摘要

Global average pooling (GAP) has been used previously to generate class activation maps. The motivation behind AdGAP comes from the fact that the convolutional filters possess position information of the essential features and hence, combination of the feature maps could help us locate the class instances in an image. Our novel architecture generates promising results and unlike previous methods, the architecture is not sensitive to the size of the input image, thus promising wider application.

关键词

Global average pooling; Convolutional neural networks; MNIST; Feature maps; Class activation maps

全文

PDF


Enhancing RNN Based OCR by Transductive Transfer Learning From Text to Images

摘要

This paper presents a novel approach for optical character recognition (OCR) on acceleration and to avoid underfitting by text. Previously proposed OCR models typically take much time in the training phase and require large amount of labelled data to avoid underfitting. In contrast, our method does not require such condition. This is a challenging task related to transferring the character sequential relationship from text to OCR. We build a model based on transductive transfer learning to achieve domain adaptation from text to image. We thoroughly evaluate our approach on different datasets, including a general one and a relatively small one. We also compare the performance of our model with the general OCR model on different circumstances. We show that (1) our approach accelerates the training phase 20-30% on time cost; and (2) our approach can avoid underfitting while model is trained on a small dataset.

关键词

transductive transfer learning; OCR; text

全文

PDF


Bayesian Optimization Meets Search Based Optimization: A Hybrid Approach for Multi-Fidelity Optimization

摘要

Many real-life problems require optimizing functions with expensive evaluations. Bayesian Optimization (BO) and Search-based Optimization (SO) are two broad families of algorithms that try to find the global optima of a function with the goal of minimizing the number of function evaluations. A large body of existing work deals with the single-fidelity setting, where function evaluations are very expensive but accurate. However, in many applications, we have access to multiple-fidelity functions that vary in their cost and accuracy of evaluation. In this paper, we propose a novel approach called Multi-fidelity Hybrid (MF-Hybrid) that combines the best attributes of both BO and SO methods to discover the global optima of a black-box function with minimal cost. Our experiments on multiple benchmark functions show that the MF-Hybrid algorithm outperforms existing single-fidelity and multi-fidelity optimization algorithms.

关键词

Global Optimization; Bayesian Optimization; Multi-fidelity Optimization

全文

PDF


Towards Experienced Anomaly Detector Through Reinforcement Learning

摘要

This abstract proposes a time series anomaly detector which 1) makes no assumption about the underlying mechanism of anomaly patterns, 2) refrains from the cumbersome work of threshold setting for good anomaly detection performance under specific scenarios, and 3) keeps evolving with the growth of anomaly detection experience. Essentially, the anomaly detector is powered by the Recurrent Neural Network (RNN) and adopts the Reinforcement Learning (RL) method to achieve the self-learning process. Our initial experiments demonstrate promising results of using the detector in network time series anomaly detection problems.

关键词

Time Series Anomaly Detection; Recurrent Neural Network; Reinforcement Learning

全文

PDF


Dynamic Detection of Communities and Their Evolutions in Temporal Social Networks

摘要

In this paper, we propose a novel community detection model, which explores the dynamic community evolutions in temporal social networks by modeling temporal affiliation strength between users and communities. Instead of transforming dynamic networks into static networks, our model utilizes normal distribution to estimate the change of affiliation strength more concisely and comprehensively. Extensive quantitative and qualitative evaluation on large social network datasets shows that our model achieves improvements in terms of prediction accuracy and reveals distinctive insight about evolutions of temporal social networks.

关键词

Social Network; Community Detection; Temporal Social Network; Community Evolution;

全文

PDF


StackReader: An RNN-Free Reading Comprehension Model

摘要

Machine comprehension of text is the problem to answer a query based on a given context. Many existing systems use RNN-based units for contextual modeling linked with some attention mechanisms. In this paper, however, we propose StackReader, an end-to-end neural network model, to solve this problem, without recurrent neural network (RNN) units and its variants. This simple model is based solely on attention mechanism and gated convolutional neural network. Experiments on SQuAD have shown to have relatively high accuracy with a significant decrease in training time.

关键词

Reading Comprehension; Natural Language Processing; Attention

全文

PDF


Generating Image Captions in Arabic Using Root-Word Based Recurrent Neural Networks and Deep Neural Networks

摘要

Automatic caption generation of an image requires both computer vision and natural language processing techniques. Despite of advanced research in English caption generation, research on generating Arabic descriptions of an image is extremely limited. Semitic languages like Arabic are heavily influenced by root-words. We leverage this critical dependency of Arabic and in this paper are the first to generate captions of an image directly in Arabic using root-word based Recurrent Neural Networks and Deep Neural Networks. We report the first BLEU score for direct Arabic caption generation. Experimental results confirm that generating image captions using root-words directly in Arabic significantly outperforms the English-Arabic translated captions using state-of-the-art methods.

关键词

deep learning; computer vision; machine learning

全文

PDF


Contextual Collaborative Filtering for Student Response Prediction in Mixed-Format Tests

摘要

The purpose of this study is to design a machine learning approach to predict the student response in mixed-format tests. Particularly, a novel contextual collaborative filtering model is proposed to extract latent factors for students and test items, by exploiting the item information. Empirical results from a simulation study validate the effectiveness of the proposed method.

关键词

Collaborative Filtering; Student Response Prediction; Item Response Theory

全文

PDF


Learning Abduction Under Partial Observability

摘要

Our work extends Juba’s formulation of learning abductive reasoning from examples, in which both the relative plausibility of various explanations, as well as which explanations are valid, are learned directly from data. We extend the formulation to consider partially observed examples, along with declarative background knowledge about the missing data. We show that it is possible to use implicitly learned rules together with the explicitly given declarative knowledge to support hypotheses in the course of abduction. We observe that when a small explanation exists, it is possible to obtain a much-improved guarantee in the challenging exception-tolerant setting.

关键词

Adductive Reasoning; Knowledge Acquisition; Common-Sense Reasoning

全文

PDF


Identifying Emotional Support in Online Health Communities

摘要

Extracting emotional support in Online Health Communities provides insightful information about patients’ emotional states. Current computational approaches to identifying emotional messages, i.e., messages that contain emotional support, are typically based on a set of handcrafted features. In this paper, we show that high-level and abstract features derived from a combination of convolutional neural networks (CNN) with Long Short Term Memory (LSTM) networks can be successfully employed for emotional message identification and can obviate the need for handcrafted features.

关键词

Emotional Support Identification, Online Health Communities, Deep Learning, Convolutional LSTM

全文

PDF


Skyline Computation for Low-Latency Image-Activated Cell Identification

摘要

High-throughput label-free single cell screening technology has been studied for noninvasive analysis of various kinds of cells. We tackle the cell identification task in the cell sorting system as a continuous skyline computation. Skyline Computation is a method for extracting interesting entries from a large population with multiple attributes. Jointed rooted-tree (JR-tree) is continuous skyline computation algorithm that manages entries using a rooted-tree structure. JR-tree delays extend the tree to deeper levels to accelerate tree construction and traversal. In this study, we proposed the JR-tree-based parallel skyline computation accelerator. We implemented it on a field-programmable gate array (FPGA). We evaluated our proposed software and hardware algorithms against an existing software algorithm using synthetic and real-world datasets.

关键词

全文

PDF


Consonant-Vowel Sequences as Subword Units for Code-Mixed Languages

摘要

In this research work, we develop a state-of-art model for identifying sentiment in Hindi-English code-mixed language. We introduce new phonemic sub-word units for Hindi-English code-mixed text along with a hierarchical deep learning model which uses these sub-word units for predicting sentiment. The results indicate that the model yields a significant increase in accuracy as compared to other models.

关键词

Code Mixing; Deep Learning

全文

PDF


Sentiment Lexicon Enhanced Attention-Based LSTM for Sentiment Classification

摘要

Deep neural networks have gained great success recently for sentiment classification. However, these approaches do not fully exploit the linguistic knowledge. In this paper, we propose a novel sentiment lexicon enhanced attention-based LSTM (SLEA-LSTM) model to improve the performance of sentence-level sentiment classification. Our method successfully integrates sentiment lexicon into deep neural networks via single-head or multi-head attention mechanisms. We conduct extensive experiments on MR and SST datasets. The experimental results show that our model achieved comparable or better performance than the state-of-the-art methods.

关键词

sentiment classification; multi-head attention; sentiment lexicon

全文

PDF


NuMWVC: A Novel Local Search for Minimum Weighted Vertex Cover Problem

摘要

The minimum weighted vertex cover (MWVC) problem is a well known combinatorial optimization problem with important applications. This paper introduces a novel local search algorithm called NuMWVC for MWVC based on three ideas. First, four reduction rules are introduced during the initial construction phase. Second, the configuration checking with aspiration is proposed to reduce cycling problem. Moreover, a self-adaptive vertex removing strategy is proposed to save time.

关键词

minimum weighted vertex cover; local search; reduction rules; configuration checking with aspiration; self-adaptive vertex removing strategy

全文

PDF


Generative Adversarial Network for Abstractive Text Summarization

摘要

In this paper, we propose an adversarial process for abstractive text summarization, in which we simultaneously train a generative model G and a discriminative model D. In particular, we build the generator G as an agent of reinforcement learning, which takes the raw text as input and predicts the abstractive summarization. We also build a discriminator which attempts to distinguish the generated summary from the ground truth summary. Extensive experiments demonstrate that our model achieves competitive ROUGE scores with the state-of-the-art methods on CNN/Daily Mail dataset. Qualitatively, we show that our model is able to generate more abstractive, readable and diverse summaries.

关键词

abstractive text summarization

全文

PDF


A Novel Embedding Method for News Diffusion Prediction

摘要

News diffusion prediction aims to predict a sequence of news sites which will quote a particular piece of news. Most of previous propagation models make efforts to estimate propagation probabilities along observed links and ignore the characteristics of news diffusion processes, and they fail to capture the implicit relationships between news sites. In this paper, we propose an algorithm to model the news diffusion processes in a continuous space and take the attributes of news into account. Experiments performed on a real-world news dataset show that our model can take advantage of news’ attributes and predict news diffusion accurately.

关键词

全文

PDF


Imitation Upper Confidence Bound for Bandits on a Graph

摘要

We consider a graph of interconnected agents implementing a common policy and each playing a bandit problem with identical reward distributions. We restrict the information propagated in the graph such that agents can uniquely observe each other's actions. We propose an extension of the Upper Confidence Bound (UCB) algorithm to this setting and empirically demonstrate that our solution improves the performance over UCB according to multiple metrics and within various graph configurations.

关键词

Bandits; Bandit Problems; Multiagent Learning; Reinforcement Learning; Statistical Learning

全文

PDF


Semantic Understanding for Contextual In-Video Advertising

摘要

With the increasing consumer base of online video content, it is important for advertisers to understand the video context when targeting video ads to consumers. To improve the consumer experience and quality of ads, key factors need to be considered such as (i) ad relevance to video content (ii) where and how video ads are placed, and (iii) non-intrusive user experience. We propose a framework to semantically understand the video content for better ad recommendation that ensure these criteria.

关键词

contextual advertising, video semantic understanding, multimedia video advertising, computer vision, deep learning, machine learning

全文

PDF


Decision Making Over Combinatorially-Structured Domains

摘要

We consider a scenario where a user must make a set of correlated decisions and we propose a computational modeling of the deliberation process. We assume the user compactly expresses her preferences via soft constraints. We consider a sequential procedure that uses Decision Field Theory to model the decision making on each variable. We test this procedure on randomly generated tree-shaped Fuzzy Constraint Satisfaction Problems. Our preliminary results showed that the time increases almost in the number of nodes. This is promising in terms of modeling decision over exponentially large domains. In the future, we plan to compare our results non-sequential approach and with behavioral data to asses our approach both in terms of modeling human decision making over complex domains, and adopting DFT as a means of incorporating a form of uncertainty into the soft constraint formalism.

关键词

Preferences

全文

PDF


Balancing Lexicographic Fairness and a Utilitarian Objective With Application to Kidney Exchange

摘要

In this work, we close an open theoretical problem regarding the price of fairness in modern kidney exchanges. We then propose a hybrid fairness rule that balances a lexicographic preference ordering over agents, with a utilitarian objective. This rule has one parameter which controls a bound on the price of fairness. We apply this rule to real data from a large kidney exchange and show that our hybrid rule produces more reliable outcomes than other fairness rules.

关键词

fair division; multiagent systems; game theory

全文

PDF


Towards Neural Speaker Modeling in Multi-Party Conversation: The Task, Dataset, and Models

摘要

In this paper, we address the problem of speaker classification in multi-party conversation, and collect massive data to facilitate research in this direction. We further investigate temporal-based and content-based models of speakers, and propose several hybrids of them. Experiments show that speaker classification is feasible, and that hybrid models outperform each single component.

关键词

Dialog systems; Speaker modeling

全文

PDF


Exploring the Use of Shatter for AllSAT Through Ramsey-Type Problems

摘要

In the context of SAT solvers, Shatter is a popular tool for symmetry breaking on CNF formulas. Nevertheless, little has been said about its use in the context of AllSAT problems. AllSAT has gained much popularity in recent years due to its many applications in domains like model checking, data mining, etc. One example of a particularly transparent application of AllSAT to other fields of computer science is computational Ramsey theory. In this paper we study the effect of incorporating Shatter to the workflow of using Boolean formulas to generate all possible edge colorings of a graph avoiding prescribed monochromatic subgraphs. We identify two drawbacks in the naïve use of Shatter to break the symmetries of Boolean formulas encoding Ramsey-type problems for graphs.

关键词

SAT; AllSAT; Shatter; Symmetry breaking; Ramsey

全文

PDF


Constructing Hierarchical Bayesian Networks With Pooling

摘要

Inspired by the Bayesian brain hypothesis and deep learning, we develop a Bayesian autoencoder, a method of constructing recognition systems using a Bayesian network. We construct hierarchical Bayesian networks based on feature extraction and implement pooling to achieve invariance within a Bayesian network framework. The constructed networks propagate information bidirectionally between layers. We expect they will be able to achieve brain-like recognition using local features and global information such as their environments.

关键词

Machine Learning; Pattern Recognition; Bayesian Network; Bayesian Brain Hypothesis;

全文

PDF


Goal Recognition in Incomplete Domain Models

摘要

Recent approaches to goal recognition have progressively relaxed the assumptions about the amount and correctness of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume completeness and correctness of the domain theory against which their algorithms match observations: this is too strong for most real-world domains. In this work, we develop a goal recognition technique capable of recognizing goals using incomplete (and possibly incorrect) domain theories.

关键词

Goal Recognition; Incomplete Domain Models; Landmarks

全文

PDF


Playing SNES Games With NeuroEvolution of Augmenting Topologies

摘要

Teaching a computer to play video games has generally been seen as a reasonable benchmark for developing new AI techniques. In recent years, extensive research has been completed to develop reinforcement learning (RL) algorithms to play various Atari 2600 games, resulting in new applications of algorithms such as Deep Q-Learning or Policy Gradient that outperform humans. However, games from Super Nintendo Entertainment System (SNES) are far more complicated than Atari 2600 games as many of these state-of-the-art algorithms still struggle to perform on this platform. In this paper, we present a new platform to research algorithms on SNES games and investigate NeuroEvolution of Augmenting Topologies (NEAT) as a possible approach to develop algorithms that outperform humans in SNES games.

关键词

NEAT; Neuroevolution; genetic algorithm; neural network

全文

PDF


Automated Question Answering System for Community-Based Questions

摘要

We present our attempt at developing an efficient Question Answering system for both factoid and non-factoid questions from any domain. Empirical evaluation of our system using multiple datasets demonstrates that our system outperforms the best system from the TREC LiveQA tracks, while keeping the response time to under less than half a minute.

关键词

Question Answering; Query formulation; Answer ranking; Open-domain

全文

PDF


Memory Management With Explicit Time in Resource-Bounded Agents

摘要

The objective of my research project is the formal treatment of memory issues in Intelligent Software Agents. I extend recent work which proposed a (partial) formalization of SOAR architecture in modal logic, reasoning on a particular type of agents: resource-bounded agents. I introduce explicit treatment of time instants and time intervals by means of Metric Temporal Logic, both in the background logic and in mental operations.

关键词

Intelligent Software Agents, Memory Management, Metric Temporal Logic

全文

PDF


Comparing Reward Shaping, Visual Hints, and Curriculum Learning

摘要

Common approaches to learn complex tasks in reinforcement learning include reward shaping, environmental hints, or a curriculum. Yet few studies examine how they compare to each other, when one might prefer one approach, or how they may complement each other. As a first step in this direction, we compare reward shaping, hints, and curricula for a Deep RL agent in the game of Minecraft. We seek to answer whether reward shaping, visual hints, or the curricula have the most impact on performance, which we measure as the time to reach the target, the distance from the target, the cumulative reward, or the number of actions taken. Our analyses show that performance is most impacted by the curriculum used and visual hints; shaping had less impact. For similar navigation tasks, the results suggest that designing an effective curriculum and providing appropriate hints most improve the performance. Common approaches to learn complex tasks in reinforcement learning include reward shaping, environmental hints, or a curriculum, yet few studies examine how they compare to each other. We compare these approaches for a Deep RL agent in the game of Minecraft and show performance is most impacted by the curriculum used and visual hints; shaping had less impact. For similar navigation tasks, this suggests that designing an effective curriculum with hints most improve the performance.

关键词

Curriculum Learning, Reward Shaping, Task Transfer

全文

PDF


Adversary Is the Best Teacher: Towards Extremely Compact Neural Networks

摘要

With neural networks rapidly becoming deeper, there emerges a need for compact models. One popular approach for this is to train small student networks to mimic larger and deeper teacher models, rather than directly learn from the training data. We propose a novel technique to train student-teacher networks without directly providing label information to the student. However, our main contribution is to learn how to learn from the teacher by a unique strategy---having the student compete with a discriminator.

关键词

deep learning; knowledge; distillation; compression; GAN

全文

PDF


Influence Maximization for Social Network Based Substance Abuse Prevention

摘要

Substance use and abuse is a significant public health problem in the United States. Group-based intervention programs offer a promising means of reducing substance abuse. While effective, inappropriate intervention groups can result in an increase in deviant behaviors among participants, a process known as deviancy training. In this paper, we present GUIDE, an AI-based decision aid that leverages social network information to optimize the structure of the intervention groups.

关键词

AI; Substance Abuse; Social Good; Algorithms

全文

PDF


Rating Super-Resolution Microscopy Images With Deep Learning

摘要

With super-resolution optical microscopy, it is now possible to observe molecular mechanisms. The quality of the obtained images vary a lot depending on the samples and the imaging parameters. Moreover, evaluating this quality is a difficult task. In this work, we want to learn the quality function from scores provided by experts. We propose the use of a deep network that output a quality score for a given image. A user study evaluate the quality of the predictions against human expert scores.

关键词

deep neural network regression;super-resolution microscopy;image quality

全文

PDF


Personalized Human Activity Recognition Using Convolutional Neural Networks

摘要

A major barrier to the personalized Human Activity Recognition using wearable sensors is that the performance of the recognition model drops significantly upon adoption of the system by new users or changes in physical/behavioral status of users. Therefore, the model needs to be retrained by collecting new labeled data in the new context. In this study, we develop a transfer learning framework using convolutional neural networks to build a personalized activity recognition model with minimal user supervision.

关键词

Activity Recognition; Deep Learning; Ubiquitous computing

全文

PDF


Lifelong Learning Networks: Beyond Single Agent Lifelong Learning

摘要

Lifelong machine learning (LML) is a paradigm to design adaptive agents that can learn in dynamic environments. Current LML algorithms consider a single agent that has centralized access to all data. However, given privacy and security constraints, data might be distributed among multiple agents that can collaborate and learn from collective experience. Our goal is to extend LML from a single agent to a network of multiple agents that collectively learn a series of tasks.

关键词

全文

PDF


Predicting Depression Severity by Multi-Modal Feature Engineering and Fusion

摘要

We present our preliminary work to determine if patient's vocal acoustic, linguistic, and facial patterns could predict clinical ratings of depression severity, namely Patient Health Questionnaire depression scale (PHQ-8). We proposed a multi-modal fusion model that combines three different modalities: audio, video, and text features. By training over the AVEC2017 dataset, our proposed model outperforms each single-modality prediction model, and surpasses the dataset baseline with a nice margin.

关键词

全文

PDF


Indirect Reciprocity and Costly Assessment in Multiagent Systems

摘要

Social norms can help solving cooperation dilemmas, constituting a key ingredient in systems of indirect reciprocity (IR). Under IR, agents are associated with different reputations, whose attribution depends on socially adopted norms that judge behaviors as good or bad. While the pros and cons of having a certain public image depend on how agents learn to discriminate between reputations, the mechanisms incentivizing agents to report the outcome of their interactions remain unclear, especially when reporting involves a cost (costly reputation building). Here we develop a new model---inspired in evolutionary game theory---and show that two social norms can sustain high levels of cooperation, even if reputation building is costly. For that, agents must be able to anticipate the reporting intentions of their opponents. Cooperation depends sensitively on both the cost of reporting and the accuracy level of reporting anticipation.

关键词

Multiagent systems, Social norms, Cooperation, Reputations, Evolution

全文

PDF


Relating Children’s Automatically Detected Facial Expressions to Their Behavior in RoboTutor

摘要

Can student behavior be anticipated in real-time so that an intelligent tutor system can adapt its content to keep the student engaged? Current methods detect affective states of students during learning session to determine their engagement levels but apply the learning in next session in the form of intervention policies and tutor responses. However, if students' imminent behavioral action could be anticipated from their affective states in real-time, this could lead to much more responsive intervention policies by the tutor and assist in keeping the student engaged in an activity, thereby increasing tutor efficacy as well as student engagement levels. In this paper we explore if there exist any links between a student's affective states and his/her imminent behavior action in RoboTutor, an intelligent tutor system for children to learn math, reading and writing. We then exploit our findings to develop a real-time student behavior prediction module.

关键词

intelligent tutors; affective state estimation;

全文

PDF


Solving Generalized Column Subset Selection With Heuristic Search

摘要

We address the problem of approximating a matrix by the linear combination of a column sparse matrix and a low rank matrix. Two variants of a heuristic search algorithm are described. The first produces an optimal solution but may be slow, as these problems are believed to be NP-hard. The second is much faster, but only guarantees a suboptimal solution. The quality of the approximation and the optimality criterion can be specified in terms of unitarily invariant norms.

关键词

Heuristic Search; Column Subset Selection; PCA; A* Search

全文

PDF


Towards Better Variational Encoder-Decoders in Seq2Seq Tasks

摘要

Variational encoder-decoders have shown promising results in seq2seq tasks. However, the training process is known difficult to be controlled because latent variables tend to be ignored while decoding. In this paper, we thoroughly analyze the reason behind this training difficulty, compare different ways of alleviating it and propose a new framework that helps significantly improve the overall performance.

关键词

全文

PDF


Efficient Support Vector Machine Training Algorithm on GPUs

摘要

Support Vector Machines (SVMs) are popular for many machine learning tasks. With rapid growth of dataset size, the high cost of training limits the wide use of SVMs. Several SVM implementations on GPUs have been proposed to accelerate SVMs. However, they support only classification (SVC) or regression (SVR). In this work, we propose a simple and effective SVM training algorithm on GPUs which can be used for SVC, SVR and one-class SVM. Initial experiments show that our implementation outperforms existing ones. We are in the process of encapsulating our algorithm into an easy-to-use library which has Python, R and MATLAB interfaces.

关键词

SVM; GPU; Optimization

全文

PDF


Explainable Cross-Domain Recommendations Through Relational Learning

摘要

We propose a method to generate explainable recommendation rules on cross-domain problems. Our two main contributions are: i) using relational learning to generate the rules which are able to explain clearly why the items were recommended to the particular user, ii) using the user's preferences of items on different domains and item attributes to generate novel or unexpected recommendations for the user. To illustrate that our method is indeed feasible and applicable, we conducted experiments on music and movie domains.

关键词

explainable; cross-domain recommendations; rule generation

全文

PDF


Different Cycle, Different Assignment: Diversity in Assignment Problems With Multiple Cycles

摘要

We present approaches to handle diverse assignments in multi-cycle assignment problems. The goal is to assign a task to different agents in each cycle, such that all possible combinations are made over time. Our method combines the original profit value, that is to be optimized by the assignment problem with an additional assignment preference. By merging both, we steer the optimization towards diverse assignments without large trade-offs in the original profits.

关键词

全文

PDF


Dialogue Generation With GAN

摘要

This paper presents a Generative Adversarial Network (GAN) to model multiturn dialogue generation, which trains a latent hierarchical recurrent encoder-decoder simultaneously with a discriminative classifier that make the prior approximate to the posterior. Experiments show that our model achieves better results.

关键词

全文

PDF


Label Space Driven Heterogeneous Transfer Learning With Web Induced Alignment

摘要

Heterogeneous Transfer Learning (HTL) algorithms leverage knowledge from a heterogeneous source domain to perform a task in a target domain. We present a novel HTL algorithm that works even where there are no shared features, instance correspondences and further, the two domains do not have identical labels. We utilize the label relationships via web-distance to align the data of the domains in the projected space, while preserving the structure of the original data.

关键词

Transfer learning; Heterogeneous Transfer Learning

全文

PDF


Uncovering Scene Context for Predicting Privacy of Online Shared Images

摘要

With the exponential increase in the number of images that are shared online every day, the development of effective and efficient learning methods for image privacy prediction has become crucial. Prior works have used as features automatically derived object tags from images' content and manually annotated user tags. However, we believe that in addition to objects, the scene context obtained from images’ content can improve the performance of privacy prediction. Hence, we propose to uncover scene-based tags from images' content using convolutional neural networks. Experimental results on a Flickr dataset show that the scene tags and object tags complement each other and yield the best performance when used in combination with user tags.

关键词

image privacy prediction; scene tags; object tags; Convolutional neural networks; social media; image tagging

全文

PDF


A New Benchmark and Evaluation Schema for Chinese Typo Detection and Correction

摘要

Despite the vast amount of research related to Chinese typo detection, we still lack a publicly available benchmark dataset for evaluation. Furthermore, no precise evaluation schema for Chinese typo detection has been defined. In response to these problems: (1) we release a benchmark dataset to assist research on Chinese typo correction; (2) we present an evaluation schema which was adopted in our NLPTEA 2017 Shared Task on Chinese Spelling Check; and (3) we report new improvements to our Chinese typo detection system ACT.

关键词

NLP

全文

PDF


Exploring Relevance Judgement Inspired by Quantum Weak Measurement

摘要

Quantum Theory (QT) has been applied in a number of fields outside physics, e.g. Information Retrieval (IR). A series of pioneering works have verified the necessity to employ QT in IR user models. In this paper, we explore the process of relevance judgement from a novel perspective of the two state vector quantum weak measurement (WM) by considering context information in time domain. Experiments are carried out to verify our arguments.

关键词

Quantum Weak Measurement; Session Search

全文

PDF


Deep Embedding for Determining the Number of Clusters

摘要

Determining the number of clusters is important but challenging, especially for data of high dimension. In this paper, we propose Deep Embedding Determination (DED), a method that can solve jointly for the unknown number of clusters and feature extraction. DED first combines the virtues of the convolutional autoencoder and the t-SNE technique to extract low dimensional embedded features. Then it determines the number of clusters using an improved density-based clustering algorithm. Our experimental evaluation on image datasets shows significant improvement over state-of-the-art methods and robustness with respect to hyperparameter settings.

关键词

clustering; deep learning; dimension reduction

全文

PDF


Fast Approximate Nearest Neighbor Search via k-Diverse Nearest Neighbor Graph

摘要

Approximate nearest neighbor search is a fundamental problem and has been studied for a few decades. Recently graph-based indexing methods have demonstrated their great efficiency, whose main idea is to construct neighborhood graph offline and perform a greedy search starting from some sampled points of the graph online. Most existing graph-based methods focus on either the precise k-nearest neighbor (k-NN) graph which has good exploitation ability, or the diverse graph which has good exploration ability. In this paper, we propose the k-diverse nearest neighbor (k-DNN) graph, which balances the precision and diversity of the graph, leading to good exploitation and exploration abilities simultaneously. We introduce an efficient indexing algorithm for the construction of the k-DNN graph inspired by a well-known diverse ranking algorithm in information retrieval (IR). Experimental results show that our method can outperform both state-of-the-art precise graph and diverse graph methods.

关键词

indexing; nearest neighbor search

全文

PDF


Discriminative Semi-Supervised Feature Selection via Rescaled Least Squares Regression-Supplement

摘要

In this paper, we propose a Discriminative Semi-Supervised Feature Selection (DSSFS) method. In this method, a ε-dragging technique is introduced to the Rescaled Linear Square Regression in order to enlarge the distances between different classes. An iterative method is proposed to simultaneously learn the regression coefficients, ε-draggings matrix and predicting the unknown class labels. Experimental results show the superiority of DSSFS.

关键词

Feature Selection;Semi-Supervised Feature Selection;Rescaled Linear Square Regression

全文

PDF


Path-Based Attention Neural Model for Fine-Grained Entity Typing

摘要

Fine-grained entity typing aims to assign entity mentions in the free text with types arranged in a hierarchical structure. It suffers from the label noise in training data generated by distant supervision. Although recent studies use many features to prune wrong label ahead of training, they suffer from error propagation and bring much complexity. In this paper, we propose an end-to-end typing model, called the path-based attention neural model (PAN), to learn a noise-robust performance by leveraging the hierarchical structure of types. Experiments on two data sets demonstrate its effectiveness.

关键词

Fine-Grained Entity Typing

全文

PDF


Learning Attention Model From Human for Visuomotor Tasks

摘要

A wealth of information regarding intelligent decision making is conveyed by human gaze and visual attention, hence, modeling and exploiting such information might be a promising way to strengthen algorithms like deep reinforcement learning. We collect high-quality human action and gaze data while playing Atari games. Using these data, we train a deep neural network that can predict human gaze positions and visual attention with high accuracy.

关键词

Eye movements; Visual Attention; Saliency; Deep Learning

全文

PDF


Bayesian Network Structure Learning: The Two-Step Clustering-Based Algorithm

摘要

In this paper we introduce a two-step clustering-based strategy, which can automatically generate prior information from data in order to further improve the accuracy and time efficiency of state-of-the-art algorithms for Bayesian network structure learning. Our clustering-based strategy is composed of two steps. In the first step, we divide the potential nodes into several groups via clustering analysis and apply Bayesian network structure learning to obtain some pre-existing arcs within each cluster. In the second step, with all the within-cluster arcs being well preserved, we learn the between-cluster structure of the given network. Experimental results on benchmark datasets show that a wide range of structure learning algorithms benefit from the proposed clustering-based strategy in terms of both accuracy and efficiency.

关键词

Bayesian Networks; Clustering Analysis; Two-step Structure Learning

全文

PDF


A Semi-Supervised Network Embedding Model for Protein Complexes Detection

摘要

Protein complex is a group of associated polypeptide chains which plays essential roles in biological process. Given a graph representing protein-protein interactions (PPI) network, it is critical but non-trivial to detect protein complexes.In this paper, we propose a semi-supervised network embedding model by adopting graph convolutional networks to effectively detect densely connected subgraphs. We conduct extensive experiment on two popular PPI networks with various data sizes and densities. The experimental results show our approach achieves state-of-the-art performance.

关键词

Network Embedding; Protein Complexes; Semi-Supervised

全文

PDF


Variance Reduced K-Means Clustering

摘要

It is challenging to perform k-means clustering on a large scale dataset efficiently. One of the reasons is that k-means needs to scan a batch of training data to update the cluster centers at every iteration, which is time-consuming. In the paper, we propose a variance reduced k-mean VRKM, which outperforms the state-of-the-art method, and obtain 4× speedup for large-scale clustering. The source code is available on https://github.com/YaweiZhao/VRKM_sofia-ml.

关键词

clustering; k-means clustering; variance reduction;

全文

PDF


Joint Learning of Evolving Links for Dynamic Network Embedding

摘要

This paper studies the problem of learning node embeddings (a.k.a. distributed representations) for dynamic networks. The embedding methods allocate each node in network with a d-dimensions vector, which can generalize across various tasks, such as item recommendation, node labeling, and link prediction. In practice, many real-world networks are evolving with nodes/links added or deleted. However, most of the proposed methods are focusing on static networks. Although some previous researches have shown some promising results to handle the dynamic scenario, they just considered the added links and ignored the deleted ones. In this work, we designed a joint learning of added and deleted links model, named RDEM, for dynamic network embedding.

关键词

node embedding; network embedding; dynamic networks;

全文

PDF


Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

摘要

High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.

关键词

Generative adversarial networks; conditional random fields; spectral-spatial features

全文

PDF


Interactive Machine Learning at Scale With CHISSL

摘要

We demonstrate CHISSL a scalable client-server system for real-time interactive machine learning. Our system is capable of incorporating user feedback incrementally and immediately without a pre-defined prediction task. Computation is partitioned between a lightweight web-client and a heavyweight server. The server relies on representation learning and off-the-shelf agglomerative clustering to find a dendrogram, which we use to quickly approximate distances in the representation space. The client, using only this dendrogram, incorporates user feedback via transduction. Distances and predictions for each unlabeled instance are updated incrementally and deterministically, with O(n) space and time complexity. Our algorithm is implemented in a functional prototype, designed to be easy to use by non-experts. The prototype organizes the large amounts of data into recommendations. This allows the user to interact with actual instances by dragging and dropping to provide feedback in an intuitive manner. We applied CHISSL to several domains including cyber, social media, and geo-temporal analysis.

关键词

semi-supervised learning; user interface; agglomerative clustering

全文

PDF


Lookine: Let the Blind Hear a Smile

摘要

It is believed that nonverbal visual information including facial expressions, facial micro-actions and head movements plays a significant role in fundamental social communication. Unfortunately it is regretful that the blind can not achieve such necessary information. Therefore, we propose a social assistant system, Lookine, to help them to go beyond this limitation. For Lookine, we apply the novel techniques including facial expression recognition, facial action recognition and head pose estimation, and obey barrier-free principles in our design. In experiments, the algorithm evaluation and user study prove that our system has promising accuracy, good real-time performance, and great user experience.

关键词

facial expression recognition; facial action recognition; head pose estimation

全文

PDF


Water Advisor - A Data-Driven, Multi-Modal, Contextual Assistant to Help With Water Usage Decisions

摘要

We demonstrate Water Advisor, a multi-modal assistant to help non-experts make sense of complex water quality data and apply it to their specific needs. A user can chat with the tool about water quality and activities of interest, and the system tries to advise using available water data for a location, applicable water regulations and relevant parameters using AI methods.

关键词

Decision Support; Water Quality; NLP; Data Analysis; User Interfaces; Chatbots; Regulations; Ethics

全文

PDF


A Unified Implicit Dialog Framework for Conversational Commerce

摘要

We propose a unified Implicit Dialog framework for goal-oriented, information seeking tasks of Conversational Commerce applications. It aims to enable the dialog interactions with domain data without replying on the explicitly encoded rules but utilizing the underlying data representation to build the components required for the interactions, which we refer as Implicit Dialog in this work. The proposed framework consists of a pipeline of End-to-End trainable modules. It generates a centralized knowledge representation to semantically ground multiple sub-modules. The framework is also integrated with an associated set of tools to gather end users' input for continuous improvement of the system. This framework is designed to facilitate fast development of conversational systems by identifying the components and the data that can be adapted and reused across many end-user applications. We demonstrate our approach by creating conversational agents for several independent domains.

关键词

dialog system, implicit dialog, knowledge representation

全文

PDF


Vertical Domain Text Classification: Towards Understanding IT Tickets Using Deep Neural Networks

摘要

It is challenging to directly apply text classification models without much feature engineering on domain-specific use cases, and expect the state of art performance. Much more so when the number of classes is large. Convolutional Neural Network (CNN or Con-vNet) has attracted much in text mining due to its effectiveness in automatic feature extraction from text. In this paper, we compare traditional and deep learning approaches for automatic categorization of IT tickets in a real-world production ticketing system. Experimental results demonstrate the good potential of CNN models in our task.

关键词

vertical domain; text classification; IT ticket

全文

PDF


Constructing Domain-Specific Search Engines With No Programming

摘要

We propose a demonstration of myDIG (my Domain-specific Insight Graphs), a system that allows non-technical domain experts, including those with no programming experience, to construct a domain-specific search engine over a raw corpus of webpages. myDIG has been developed and refined over multiple years under the DARPA MEMEX program, and has undergone rigorous user testing with actual domain experts from investigative agencies like the Securities and Exchange Commission (SEC). All components of myDIG are open-source, and the product of fundamental research.

关键词

Knowledge Graphs; Domain-specific Search; Information Extraction

全文

PDF


A Cognitive Assistant for Visualizing and Analyzing Exoplanets

摘要

We demonstrate an embodied cognitive agent that helps scientists visualize and analyze exo-planets and their host stars. The prototype is situated in a room equipped with a large display, microphones, cameras, speakers, and pointing devices. Users communicate with the agent via speech, gestures, and combinations thereof, and it responds by displaying content and generating synthesized speech. Extensive use of context facilitates natural interaction with the agent.

关键词

全文

PDF


Perception-Action-Learning System for Mobile Social-Service Robots Using Deep Learning

摘要

We introduce a novel perception-action-learning system for mobile social-service robots. The state-of-the-art deep learning techniques were incorporated into each module which significantly improves the performance in solving social service tasks. The system not only demonstrated fast and robust performance in a homelike environment but also achieved the highest score in the RoboCup2017@Home Social Standard Platform League (SSPL) held in Nagoya, Japan.

关键词

Social Service Robots; Mobile Robots; Artificial Intelligence; Deep Learning; Integrated System

全文

PDF


Agent Assist: Automating Enterprise IT Support Help Desks

摘要

In this paper, we present Agent Assist, a virtual assistant which helps IT support staff to resolve tickets faster. It is essentially a conversation system which provides procedural and often complex answers to queries. This system can ingest knowledge from various sources like application documentation, ticket management systems and knowledge transfer video recordings. It uses an ensemble of techniques like question classification, knowledge graph based disambiguation, information retrieval, etc., to provide quick and relevant solutions to problems from various technical domains and is currently being used in more than 650 projects within IBM.

关键词

Question Answering, Cognitive Ensemble, IT support, Machine Learning

全文

PDF


Dataset Evolver: An Interactive Feature Engineering Notebook

摘要

We present DATASET EVOLVER, an interactive Jupyter notebook-based tool to support data scientists perform feature engineering for classification tasks. It provides users with suggestions on new features to construct, based on automated feature engineering algorithms. Users can navigate the given choices in different ways, validate the impact, and selectively accept the suggestions. DATASET EVOLVER is a pluggable feature engineering framework where several exploration strategies could be added. It currently includes meta-learning based exploration and reinforcement learning based exploration. The suggested features are constructed using well-defined mathematical functions and are easily interpretable. Our system provides a mixed-initiative system of a user being assisted by an automated agent to efficiently and effectively solve the complex problem of feature engineering. It reduces the effort of a data scientist from hours to minutes.

关键词

Feature Engineering; Classification

全文

PDF


PegasusN: A Scalable and Versatile Graph Mining System

摘要

How can we find patterns and anomalies in peta-scale graphs? Even recently proposed graph mining systems fail in processing peta-scale graphs. In this work, we propose PegasusN, a scalable and versatile graph mining system that runs on Hadoop and Spark. To handle enormous graphs, PegasusN provides and seamlessly integrates efficient algorithms for various graph mining operations: graph structure analyses, subgraph enumeration, graph generation, and graph visualization. PegasusN quickly processes extra-large graphs that other systems cannot handle.

关键词

graph mining; distributed system; hadoop; spark

全文

PDF


BaitBuster: A Clickbait Identification Framework

摘要

The use of tempting and often misleading headlines (clickbait) to allure readers has become a growing practice nowadays among the media outlets. The widespread use of clickbait risks the reader’s trust in media. In this paper, we present BaitBuster, a browser extension and social bot based framework, that detects clickbaits floating on the web, provides brief explanation behind its decision, and regularly makes users aware of potential clickbaits.

关键词

Disinformation; Social Media; Machine Learning

全文

PDF


Democratization of Deep Learning Using DARVIZ

摘要

With an abundance of research papers in deep learning, adoption and reproducibility of existing works becomes a challenge. To make a DL developer life easy, we propose a novel system, DARVIZ, to visually design a DL model using a drag-and-drop framework in an platform agnostic manner. The code could be automatically generated in both Caffe and Keras. DARVIZ could import (i) any existing Caffe code, or (ii) a research paper containing a DL design; extract the design, and present it in visual editor.

关键词

Deep Learning; Research Paper Mining; DARVIZ

全文

PDF


Learning an Image-based Obstacle Detector With Automatic Acquisition of Training Data

摘要

We detect and localize obstacles in front of a mobile robot by means of a deep neural network that maps images acquired from a forward-looking camera to the outputs of five proximity sensors. The robot autonomously acquires training data in multiple environments; once trained, the network can detect obstacles and their position also in unseen scenarios, and can be used on different robots, not equipped with proximity sensors. We demonstrate both the training and deployment phases on a small modified Thymio robot.

关键词

全文

PDF


MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence

摘要

We introduce MAgent, a platform to support research and development of many-agent reinforcement learning. Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents. Within the interactions among a population of agents, it enables not only the study of learning algorithms for agents' optimal polices, but more importantly, the observation and understanding of individual agent's behaviors and social phenomena emerging from the AI society, including communication languages, leaderships, altruism. MAgent is highly scalable and can host up to one million agents on a single GPU server. MAgent also provides flexible configurations for AI researchers to design their customized environments and agents. In this demo, we present three environments designed on MAgent and show emerged collective intelligence by learning from scratch.

关键词

reinforcement learning; multiagent system; learning environment;

全文

PDF