October 28, 2019

Distributed Systems Research request for proposals

About

Modern data centers operate with hundreds of software systems, running in millions of containers, serving billions of requests per second [Hahn LISA ’18]. At such a large scale, high operational flexibility, debuggability, and high reliability are paramount.

Unfortunately, at such a large scale, three key challenges stand in the way of these goals:

  1. Interdependence. The programs running in modern data centers make up larger workloads, and failures in one program can have a ripple effect, causing failures in other programs or in the entire workload.
  2. Distribution. The workloads in modern data centers are distributed across many servers – and even across the globe – and server failures in one location can have widespread implications across the site.
  3. Heterogeneous hardware. Modern data centers are composed of a diverse mix of hardware, from commodity devices to special-purpose accelerators, and servers are configured with different compute, memory, and storage profiles. In such an environment, fail-slow behavior, where a single job still makes forward progress but becomes slow, can collapse cluster performance [Gunawi+ FAST ’18].

At Facebook, we are performing forward-looking research into the area of distributed systems, applying key techniques from the field at Facebook’s scale and sharing our designs, implementations, insights, and data with the community. In recent years, we’ve released our work on Distributed Configuration Management [Tang+ SOSP ’15], Large-Scale Production Load Testing [Veeraraghavan+ OSDI ’16], End-to-End Performance Tracing [Kaldor+ SOSP ’17], and more [Huang+ SOSP ’17, Veeraraghavan+ OSDI ’18, Annamalai+ OSDI ’18].

To anticipate challenges that have yet to emerge, it is important that Facebook builds a strong collaboration pipeline with key experts in academia. Therefore, Facebook is pleased to invite faculty and graduate students to respond to this year’s call for research proposals pertaining to distributed systems. We anticipate giving eight awards of around $50,000 each, made as unrestricted gifts to the proposers’ host universities.

While all proposals are welcome, we are particularly interested in those that address fundamental challenges that arise in distributed systems operating at an extremely large scale. Example topics include:

  • Distributed performance tracing and analysis
  • Efficient shard mapping and placement (on the order of billions of shards)
  • Request load balancing (e.g., for HTTP and RPC traffic)
  • Fault tolerance (with a focus on power and network fault domains)
  • Large-scale configuration management, monitoring, and deployment
  • Efficient use of hardware resources via software codesign

Applicants should submit a proposal detailing what contribution their research is expected to make, how the broader research community will benefit from the work, a project timeline, and an overview of how the proposed funding will be used.

Proposals should include:

  • A summary of the project (1-2 pages) explaining the area of focus, a description of techniques, any relevant prior work, and a timeline with milestones and expected outcomes
  • A draft budget description (1 page) including an approximate cost of the award and an explanation of how funds would be spent
  • Curriculum vitae for each project participant
  • Organization details; this will include tax information and administrative contact details

Eligibility

  • Awards must comply with applicable U.S. and international laws, regulations, and policies.
  • Applicants must be current full-time faculty at an accredited academic institution that awards research degrees to PhD students.
  • Applicants must be the principal investigator on any resulting award.

Additional information

Winners will be invited to a Facebook event at one of our data centers, which will include a tour, a summit discussing the winning proposals, and an opportunity to network with Facebook engineers. The event will take place in spring 2020; the exact date and location have yet to be determined. Facebook will pay for travel and accommodations for one representative per winning proposal.

Timing and dates

Applications are now open. The deadline to apply is December 6, 2019, at 11:59 p.m. AoE (Anywhere on Earth).
Notifications will be sent by email to selected applicants in January 2020.

Terms and Conditions

  • By submitting a proposal, you are authorizing Facebook to evaluate the proposal for a potential award, and you agree to the terms herein.
  • You agree that Facebook will not be required to treat any part of the proposal as confidential or protected by copyright, and may use, edit, modify, copy, reproduce, and distribute all or a portion of your proposal in any manner for the sole purposes of administering the website and evaluating the contents of the proposal.
  • You agree and acknowledge that personal data submitted with the proposal, including name, mailing address, phone number, and email address of you and other named researchers in the proposal may be collected, processed, stored and otherwise used by Facebook for the purposes of administering the website and evaluating the contents of the proposal.
  • You acknowledge that neither party is obligated to enter into any business transaction as a result of the proposal submission, Facebook is under no obligation to review or consider the proposal, and neither party acquires any intellectual property rights as a result of submitting the proposal.
  • Any feedback you provide to Facebook in the proposal regarding its products or services will not be treated as confidential or protected by copyright, and Facebook is free to use such feedback on an unrestricted basis with no compensation to you.

