August 28, 2017

Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner

Very Large Data Bases Conference (VLDB)

We design and implement a distributed algorithm for balanced k-way hypergraph partitioning that minimizes fanout, a fundamental hypergraph quantity also known as the communication volume and (k − 1)-cut metric, by optimizing a novel objective called probabilistic fanout. This choice allows a simple local search heuristic to achieve solution quality comparable to that of the best existing hypergraph partitioners.

Igor Kabiljo, Brian Karrer, Mayank Pundir, Sergey Pupyrev, Alon Shalita
August 21, 2017

Engineering Egress with Edge Fabric: Steering Oceans of Content to the World

ACM SIGCOMM

Large content providers build points of presence around the world, each connected to tens or hundreds of networks. Ideally, this connectivity lets providers better serve users, but providers cannot obtain enough capacity on some preferred peering paths to handle peak traffic demands. These capacity constraints, coupled with volatile traffic and performance and the limitations of the 20-year-old BGP protocol, make it difficult to best use this connectivity. This paper presents Edge Fabric, an SDN-based system we built and deployed to tackle these challenges for Facebook, which serves over two billion users from dozens of points of presence on six continents.

Brandon Schlinker, Hyojeong Kim, Timothy Cui, Ethan Katz-Bassett, Harsha V. Madhyastha, Italo Cunha, James Quinn, Saif Hasan, Petr Lapukhov, James Hongyi Zeng
August 21, 2017

SilkRoad: Making Stateful Layer-4 Load Balancing Fast and Cheap Using Switching ASICs

Association for Computing Machinery's Special Interest Group on Data Communications (SIGCOMM)

In this paper, we show that up to hundreds of software load balancer (SLB) servers can be replaced by a single modern switching ASIC, potentially reducing the cost of load balancing by over two orders of magnitude. Today, large data centers typically employ hundreds or thousands of servers to load-balance incoming traffic over application servers.

Rui Miao, Hongyi Zeng, Changhoon Kim, Jeongkeun Lee, Minlan Yu
August 14, 2017

Malicious Browser Extensions at Scale: Bridging the Observability Gap between Web Site and Browser

USENIX Workshop on Cyber Security Experimentation and Test

In this paper we describe an approach used at Facebook for dealing with this problem. We present a methodology whereby users exhibiting suspicious online behaviors are scanned (with permission) to identify the set of extensions in their browser, and those extensions are in turn labelled based on the threat indicators they contain.

Louis F. DeKoven, Stefan Savage, Geoffrey M. Voelker, Nektarios Leontiadis
August 6, 2017

Unsupervised Learning by Predicting Noise

International Conference on Machine Learning (ICML)

Convolutional neural networks provide visual features that perform well in many computer vision applications. However, training these networks requires large amounts of supervision; this paper introduces a generic framework to train such networks, end-to-end, with no supervision. We propose to fix a set of target representations, called Noise As Targets (NAT), and to constrain the deep features to align to them.

Piotr Bojanowski, Armand Joulin
August 6, 2017

Wasserstein Generative Adversarial Networks

International Conference on Machine Learning (ICML)

We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.

Martin Arjovsky, Soumith Chintala, Leon Bottou
August 6, 2017

Language Modeling with Gated Convolutional Networks

International Conference on Machine Learning (ICML)

The predominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens.

Yann Dauphin, Angela Fan, Michael Auli, David Grangier
August 6, 2017

Efficient Softmax Approximation for GPUs

International Conference on Machine Learning (ICML)

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.

Edouard Grave, Armand Joulin, Moustapha Cisse, David Grangier, Hervé Jégou
August 6, 2017

Convolutional Sequence to Sequence Learning

International Conference on Machine Learning (ICML)

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent […]

Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin
July 31, 2017

Focal Surface Displays

SIGGRAPH 2017

We introduce focal surface displays to meet the challenge of vergence-accommodation conflict, augmenting conventional HMDs with a phase-only spatial light modulator (SLM) placed between the display screen and viewing optics. This SLM acts as a dynamic freeform lens, shaping synthesized focal surfaces to conform to the virtual scene geometry.

Nathan Matsuda, Alexander Fix, Douglas Lanman
July 31, 2017

Enriching Word Vectors with Subword Information

TACL, Association for Computational Linguistics (ACL 2017)

Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn […]

Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov
July 31, 2017

Learning Multilingual Joint Sentence Embeddings with Neural Machine Translation

ACL workshop on Representation Learning for NLP (ACL)

In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages. Our aim is that a representation which is independent of the language is likely to capture the underlying semantics.

Holger Schwenk, Matthijs Douze
July 30, 2017

Reading Wikipedia to Answer Open-Domain Questions

Association for Computational Linguistics (ACL 2017)

This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article.

Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes
July 30, 2017

Automatically Generating Rhythmic Verse with Neural Networks

Association for Computational Linguistics (ACL 2017)

We propose two novel methodologies for the automatic generation of rhythmic poetry in a variety of forms.

Jack Hopkins
July 24, 2017

Untagging on Social Media: Who Untags, What do they Untag, and Why?

Journal: Computers in Human Behavior

Using de-identified, aggregated behavioral data from Facebook and a survey of 802 people, this paper aims to explore untagging by asking whether untagging occurs similarly to other self-presentation behavior and how people view this strategy.

Jeremy Birnholtz, Moira Burke, Annie Steele
July 22, 2017

Densely Connected Convolutional Networks

CVPR 2017

In this paper, we embrace the observation that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output, and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion.

Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
July 21, 2017

Discovering Causal Signals in Images

CVPR 2017

This paper establishes the existence of observable footprints that reveal the “causal dispositions” of the object categories appearing in collections of images.

David Lopez-Paz, Robert Nishihara, Soumith Chintala, Bernhard Scholkopf, Leon Bottou
July 21, 2017

Relationship Proposal Networks

Conference on Computer Vision and Pattern Recognition 2017

Image scene understanding requires learning the relationships between objects in the scene. A scene with many objects may have only a few individual interacting objects (e.g., in a party image with many people, only a handful of people might be speaking with each other). To detect all relationships, it would be inefficient to first detect all individual objects and then classify all pairs; not only is the number of all pairs quadratic, but classification requires limited object categories, which is not scalable for real-world images. In this paper we address these challenges by using pairs of related regions in images to train a relationship proposer that at test time produces a manageable number of related regions.

Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal
July 21, 2017

Proton Testing Results for Kaman KD-5100 Differential Inductive Position Measuring Systems

Journal, IEEE Radiation Effects Data Workshop (REDW), in press

We report proton testing of a position measuring system, the Kaman KD-5100, with applications including mirror positioning for laser beam control. We measure a device response likely due to total ionizing dose and/or displacement damage.

Bart McGuyer, Randall Milanowski, Slaven Moro, Norman Hall, Bert Vermeire
July 21, 2017

Link the head to the “beak”: Zero Shot Learning from Noisy Text Description at Part Precision

CVPR 2017

In this paper, we study learning visual classifiers from unstructured text descriptions at part precision with no training images. We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal