August 11, 2013

MI2LS: Multi-Instance Learning from Multiple Information Sources

ACM Conference on Knowledge Discovery and Data Mining (KDD)

In Multiple Instance Learning (MIL), each entity is normally expressed as a set of instances. Most of the current MIL methods only deal with the case when each instance is represented by one type of f…

By: Dan Zhang, Jingrui He, Richard Lawrence

June 17, 2013

MILEAGE: Multiple Instance LEArning with Global Embedding

International Conference on Machine Learning (ICML)

Multiple Instance Learning (MIL) generally represents each example as a collection of instances so that the features of local objects can be captured better, whereas traditional methods typically extract a global feature vector for each example as a whole. However, little research has investigated which of the two learning scenarios performs better.

By: Dan Zhang, Jingrui He, Luo Si, Richard Lawrence

May 20, 2013

Machine Learning Paradigms for Speech Recognition: An Overview

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, Bay…

By: Li Deng, Xiao Li

May 13, 2013

Latent Credibility Analysis

International World Wide Web Conference (WWW)

A frequent problem when dealing with data gathered from multiple sources on the web (ranging from booksellers to Wikipedia pages to stock analyst predictions) is that these sources disagree, and we must decide which of their (often mutually exclusive) claims we should accept. Current state-of-the-art information credibility algorithms known as ‘fact-finders’ are transitive voting systems with rules specifying how votes iteratively flow from sources to claims and then back to sources.
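
The iterative vote flow described above can be sketched in a few lines. This is a minimal illustration of a simple sum-based fact-finder in the Hubs-and-Authorities style, not the method proposed in this paper; the function name `sums_fact_finder`, its parameters, and the toy data are assumptions made for the example.

```python
from collections import defaultdict


def sums_fact_finder(assertions, iterations=20):
    """Toy sum-based fact-finder (illustrative only).

    assertions: iterable of (source, claim) pairs.
    Returns (trust, belief) dictionaries, each scaled so its maximum is 1.0.
    """
    claims_by_source = defaultdict(set)
    sources_by_claim = defaultdict(set)
    for source, claim in assertions:
        claims_by_source[source].add(claim)
        sources_by_claim[claim].add(source)

    trust = {s: 1.0 for s in claims_by_source}
    belief = {c: 0.0 for c in sources_by_claim}
    if not trust:
        return trust, belief

    for _ in range(iterations):
        # Votes flow from sources to the claims they assert.
        belief = {c: sum(trust[s] for s in srcs)
                  for c, srcs in sources_by_claim.items()}
        top = max(belief.values())
        belief = {c: b / top for c, b in belief.items()}

        # Votes flow back from claims to the sources asserting them.
        trust = {s: sum(belief[c] for c in cl)
                 for s, cl in claims_by_source.items()}
        top = max(trust.values())
        trust = {s: t / top for s, t in trust.items()}

    return trust, belief


# Hypothetical data: two sources agree on claim "x", one asserts "y".
trust, belief = sums_fact_finder([("a", "x"), ("b", "x"), ("c", "y")])
```

In this toy run the claim backed by two sources ends up with the higher belief score, and the lone dissenting source ends up with the lower trust score. Normalizing by the maximum after each half-step only keeps the scores bounded; it is one of several possible choices.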

By: Jeff Pasternack, Dan Roth

May 13, 2013

CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks

International World Wide Web Conference (WWW)

In this paper we focus on the social network Facebook and the problem of discerning ill-gotten Page Likes, made by spammers hoping to turn a profit, from legitimate Page Likes. Our method, which we refer to as CopyCatch, detects lockstep Page Like patterns on Facebook by analyzing only the social graph between users and Pages and the times at which the edges in the graph (the Likes) were created.
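
To make the idea of lockstep behavior concrete, here is a naive sketch that flags pairs of users who liked the same Pages within a short time window of each other. It is not the CopyCatch algorithm itself; the function `lockstep_pairs`, its parameters `window` and `min_pages`, and the sample data are all hypothetical, and the pairwise comparison is quadratic in the number of Likes per Page.

```python
from collections import defaultdict
from itertools import combinations


def lockstep_pairs(likes, window, min_pages):
    """Naive lockstep detector (illustrative only).

    likes: iterable of (user, page, timestamp) triples.
    Returns {(user_a, user_b): set_of_pages} for user pairs that liked
    at least `min_pages` Pages within `window` time units of each other.
    """
    events_by_page = defaultdict(list)
    for user, page, ts in likes:
        events_by_page[page].append((user, ts))

    shared = defaultdict(set)
    for page, events in events_by_page.items():
        # Compare every pair of Likes on the same Page.
        for (u1, t1), (u2, t2) in combinations(events, 2):
            if u1 != u2 and abs(t1 - t2) <= window:
                shared[tuple(sorted((u1, u2)))].add(page)

    return {pair: pages for pair, pages in shared.items()
            if len(pages) >= min_pages}


# Hypothetical data: u1 and u2 like both Pages within seconds of each other.
likes = [
    ("u1", "p1", 0), ("u2", "p1", 5),
    ("u1", "p2", 100), ("u2", "p2", 103),
    ("u3", "p1", 500), ("u3", "p2", 900),
]
suspicious = lockstep_pairs(likes, window=10, min_pages=2)
```

Like the method described in the abstract, the sketch uses only the user-Page edges and their creation times, with no profile or content features.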

By: Alex Beutel, Christopher Palow, Christos Faloutsos, Tom Wanhong Xu, Venkatesan Guruswami

May 1, 2013

Hash Bit Selection: a Unified Solution for Selection Problems in Hashing

Conference on Computer Vision and Pattern Recognition (CVPR)

Hashing-based methods have recently shown promise for large-scale nearest neighbor search. However, good designs involve difficult decisions about many unknowns: data features, hashing algorithms, parameter settings, kernels, etc.

By: Xianglong Liu, Junfeng He, Bo Lang, Shih-Fu Chang

August 12, 2012

Active Sampling for Entity Matching

ACM Conference on Knowledge Discovery and Data Mining (KDD)

In entity matching, a fundamental issue when training a classifier to label pairs of entities as either duplicates or non-duplicates is selecting informative examples. Although active learning presents an attractive solution to this problem, previous approaches minimize the misclassification rate (0-1 loss) of the classifier, which is an unsuitable metric for entity matching due to class imbalance (i.e., many more non-duplicate pairs than duplicate pairs).
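
A small worked example shows why 0-1 loss can mislead under class imbalance. The counts below (990 non-duplicate pairs, 10 duplicate pairs) are hypothetical and chosen only for illustration; the helper functions are assumptions for this sketch, not code from the paper.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of pairs that are misclassified."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true duplicates that the classifier finds."""
    actual = sum(t == positive for t in y_true)
    found = sum(t == positive and p == positive
                for t, p in zip(y_true, y_pred))
    return found / actual if actual else 0.0


# Hypothetical labels: 1 = duplicate, 0 = non-duplicate.
y_true = [0] * 990 + [1] * 10

# A degenerate classifier that labels every pair as a non-duplicate.
y_pred = [0] * 1000

loss = zero_one_loss(y_true, y_pred)   # 10 errors out of 1000 pairs
found = recall(y_true, y_pred)         # no duplicates recovered
```

The degenerate classifier misclassifies only 1% of the pairs yet finds none of the duplicates, so a low misclassification rate says little about matching quality when duplicates are rare.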

By: Kedar Bellare, Suresh Iyengar Parthasarathy, Aditya Parameswaran, Vibhor Rastogi

March 1, 2012

Bootstrapping Data Arrays of Arbitrary Order

The Annals of Applied Statistics (AOAS)

In this paper we study a bootstrap strategy for estimating the variance of a mean taken over large multifactor crossed random effects data sets. We apply bootstrap reweighting independently to the lev…

By: Art B. Owen, Dean Eckles

August 15, 2011

Phonetic Classification Using Controlled Random Walks

Conference of the International Speech Communication Association (Interspeech)

Recently proposed semi-supervised learning algorithms for phonetic classifiers have obtained promising results. Often, these algorithms attempt to satisfy learning criteria that are not inherent in the standard generative or discriminative training procedures for phonetic classifiers.

By: Katrin Kirchhoff, Andrei Alexandrescu

July 24, 2011

Learning Relevance from a Heterogeneous Social Network and Its Application in Online Targeting

ACM Special Interest Group on Information Retrieval (SIGIR)

The rise of social networking services in recent years presents new research challenges for matching users with interesting content. While the content-rich nature of these social networks offers many…

By: Chi Wang, Rajat Raina, David Fong, Ding Zhou, Jiawei Han, Greg Badros