November 13, 2019

Information quality processes for Community Standards Violations metrics

By: Radha Iyengar Plumb

Radha Iyengar Plumb is the Global Head of Policy Analysis at Facebook. She leads the team that helps ensure Facebook’s content policies are closely based on rigorous research and are aligned with best practices.

Our Community Standards and Community Guidelines define what is and isn’t allowed on Facebook and Instagram. The Community Standards Enforcement Report tracks how we’re doing at enforcing these policies, and the report has evolved considerably since its initial publication in May 2018. Over time, we have expanded the report by adding new metrics, shared data on more policy areas, and refined our methodologies to more accurately measure our efforts at enforcing these policies and keeping people safe. We continue to mature and improve these methodologies over time as part of our commitment to providing meaningful and accurate metrics.

This post describes the processes by which we review and validate the metrics we report. We’ll continue to update the bottom of this post if our findings require adjustments to shared metrics.

Today, we are also sharing specific adjustments to metrics previously provided for Q3 2018, Q4 2018, and Q1 2019.

Why we developed information quality processes

We are committed to transparently sharing our metrics as well as the processes we use to calculate and improve them. To streamline and better govern the release of adjustments and corrections to our methodologies and metrics, we developed an information quality procedure to identify, rectify, and publicly report any adjustments we make to previously released information. This is a common practice among large statistical agencies and in federal agency public reports, and our procedure was developed in line with data reporting best practices in both the public and private sectors. The reviews and procedures we developed will be critical in maintaining the accuracy and integrity of our reporting going forward.

We constantly evaluate and validate our metrics and make sure the information we are sharing is accurate and our methodologies to generate this data are sound.

In July 2019, as part of this work, we updated the methodologies we use to generate some of these metrics. We also fixed a discrepancy in our accounting processes and adjusted our previously reported figures for Q3 2018 through Q1 2019; the adjusted metrics are shared below. In the November 2019 edition of the Community Standards Enforcement Report, we made corrections in the following metric categories:

  • Content actioned: How much content we took action on because it was found to violate our policies.
  • Proactive rate: Of the content we took action on, how much we found before someone reported it to us.
  • Appealed content: How much content people appealed after we took action.
  • Restored content: How much actioned content was later restored.

(Note: The November 2019 report did not contain any changes or adjustments to the prevalence metrics, which measure the percentage of views that contain content that violates our policies.)
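
To make the relationship between these metrics concrete, here is a minimal sketch in Python. The `QuarterCounts` type, its field names, and the numbers are hypothetical and chosen only for illustration; they are not taken from the report. The sketch shows that proactive rate is a ratio over content actioned, which is why a miscount in content actioned can also change the proactive rate.

```python
from dataclasses import dataclass


@dataclass
class QuarterCounts:
    """Hypothetical per-quarter tallies for one policy area."""
    actioned: int              # pieces of content action was taken on
    actioned_proactively: int  # of those, found before anyone reported them
    appealed: int              # actioned content that people appealed
    restored: int              # actioned content that was later restored


def proactive_rate(counts: QuarterCounts) -> float:
    """Share of actioned content that was found before someone reported it."""
    if counts.actioned == 0:
        return 0.0
    return counts.actioned_proactively / counts.actioned


# Made-up numbers, for illustration only.
example = QuarterCounts(actioned=1000, actioned_proactively=950, appealed=40, restored=10)
print(f"{proactive_rate(example):.1%}")  # 95.0%
```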

How we evaluate and improve our metrics

We’re constantly refining our processes and methodologies in order to provide the most meaningful and accurate numbers on how we’re enforcing our policies. Over the summer of 2019, we implemented information quality processes that create further checks and balances in order to make sure we share valid and consistent metrics.

Identifying priority scenarios to confirm validity

We identify different dimensions of each metric and develop a risk-prioritization of segments that may significantly affect the metrics. For the segments in this prioritized list, we implement multiple checks to make sure these segments are capturing information accurately.

For example, we break out our content actioned metrics into multiple dimensions to review: whether our automated systems or human reviewers took the action, what led us to take action, and what type of content (photos, text, video) we took action on.

With these different dimensions, we then assess how much bias would be introduced into our measurement if a dimension were not correctly represented in the metric (for example, if we didn’t include video content in our metrics). These assessments allow us to identify dimensions that might impact the metric (such as whether humans took action).

Then, we estimate how much the metric could be affected if that dimension were wrong (say, if we didn’t log any of the content humans took action on). We then prioritize the highest-risk scenarios for additional cross-checks. For these high-risk combinations, we develop additional tracking and cross-check systems to ensure these metrics are estimated correctly.
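
One simple way to picture this prioritization is sketched below in Python. It is an illustrative assumption, not a description of our internal tooling: the function name `rank_segments_by_risk`, the segment labels, and the counts are all hypothetical. The sketch ranks each segment by how far the metric would be understated if that segment were not logged at all.

```python
def rank_segments_by_risk(segment_counts: dict[str, int]) -> list[tuple[str, float]]:
    """Rank segments by the relative undercount if a segment were missed entirely.

    The risk score here is simply the segment's share of the total metric,
    which is an assumption made for illustration.
    """
    total = sum(segment_counts.values())
    if total == 0:
        return []
    shares = {name: count / total for name, count in segment_counts.items()}
    # Largest potential undercount first.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers, for illustration only.
actions_by_segment = {
    "automated/photo": 600,
    "automated/video": 50,
    "human/photo": 250,
    "human/text": 100,
}
for name, share in rank_segments_by_risk(actions_by_segment):
    print(f"{name}: metric would be {share:.0%} too low if this segment were missed")
```

In this toy example, the segments at the top of the ranking are the ones that would receive additional cross-checks first.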

Validity and consistency checks

We have also implemented checks to further validate our metrics. These include the following:

  • Consistency checks. We periodically measure our actions with a separate, independent system that measures content actions. On a regular basis, we compare these independent measurements against our metrics in order to identify large errors in our accounting.
  • Auditing and debugging our measurement tools. We conduct a range of random spot checks to verify the accuracy of our measurement systems in near real-time. This includes checking various outcomes that happen later in our system to double-check upstream outcomes. For example, we confirm that content that is appealed is also logged as content that has been actioned, since content must be actioned in order to be appealed. Many of these checks are intended to identify large errors, such as content that is appealed but was never removed.
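
The two kinds of checks above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the function names, the use of content IDs, and the 1% tolerance are hypothetical and do not describe the actual systems.

```python
def orphaned_appeals(actioned_ids: set[str], appealed_ids: set[str]) -> set[str]:
    """Return appealed content with no matching action record.

    Content must be actioned before it can be appealed, so this set
    should be empty when logging is correct.
    """
    return appealed_ids - actioned_ids


def counts_agree(primary: int, independent: int, tolerance: float = 0.01) -> bool:
    """Check that two independently measured, non-negative counts roughly match.

    `tolerance` is a relative difference; the 1% default is an assumption.
    """
    if primary == independent:
        return True
    return abs(primary - independent) / max(primary, independent) <= tolerance


# Made-up data, for illustration only.
actioned = {"a", "b", "c"}
appealed = {"b", "d"}
print(orphaned_appeals(actioned, appealed))  # {'d'}: appealed but never logged as actioned
print(counts_agree(1000, 995))               # True: within 1%
print(counts_agree(1000, 900))               # False: flags a large discrepancy
```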

As with all aspects of our standards enforcement reporting, we will continue to evolve and improve our validity and consistency review processes over time.

Internal review and correction procedures

We also established procedures to identify and correct information previously shared in our enforcement report, which we will regularly review and update. When we identify potential issues in metrics shared in the Community Standards Enforcement Report, we follow these steps:

  • Reporting. If a potential issue is discovered, our teams immediately file an incident report that alerts the relevant teams to begin investigating the issue.
  • Investigating and mitigating. The relevant teams review the potential issue, making immediate changes to prevent further consistency issues where necessary and developing solutions to avoid the issue in the future.
  • Sizing the issue. Once a solution has been identified, it is applied to past data, when possible, to understand whether previous metrics were impacted and revise them as needed.
  • Post-mortem incident review. Once the issue is mitigated, we conduct a detailed internal review to identify the root causes and full impact of the issue. This allows us to identify broader risks to the validity of our measurement so we can prevent or minimize them.

Reporting adjustments

Once we identify an issue and adjust the affected metric, we will publicly report the correction by updating this post at the time of the subsequent release of the Community Standards Enforcement Report. In such an update, we will describe the issue, the metrics impacted, and the time periods impacted. The data for the affected quarters in the Community Standards Enforcement Report itself will include any adjusted metrics when feasible to ensure comparisons over time are meaningful.

Seeking input and expanding the metrics we report

In addition to the work we do internally to evaluate and improve our metrics, we also look for external input on our methodologies and expand the metrics we report on to give a more robust picture of how we’re doing at enforcing our policies.

Methodology assessment and input

To ensure our methods are transparent and based on sound principles, we seek out analysis and input from subject matter experts on areas such as whether the metrics we provide are informative.

In order to ensure our approach to measuring content enforcement was meaningful and accurate, we worked with the Data Transparency Advisory Group (DTAG), an external group of international academic experts in measurement, statistics, criminology, and governance. In May 2019, they provided their independent, public assessment of whether the metrics we share in the Community Standards Enforcement Report provide accurate and meaningful measures of how we enforce our policies, as well as the challenges we face in this work, and what we do to address them. Overall, they found our metrics to be reasonable ways of measuring violations and in line with best practices. They also provided a number of recommendations for how we can continue to be more transparent about our work, which we discussed in detail and continue to explore.

Corrections and adjustments

This section details any specific adjustments identified through our information quality practices. We will update this section in accordance with the processes described above.

 

Date | Metrics impacted | Quarters impacted | Description | Adjustments
11/2019 | Content actioned, proactive rate for spam on Facebook, appealed content, restored content on Facebook | 2018 Q3 and Q4; 2019 Q1 | Correction to ensure logging accuracy for posts with single content and posts that are later shared | See Correction Table 1 below
11/2019 | Spam | All time periods | Undercount due to spam operational strategy. Review underway. | Correction not available

Correction Table 1: Content actioned, proactive rate, appealed content, and restored content metric adjustments, by quarter and violation


11/2019: Content actioned, proactive content actioned, content appealed by users, and content restored by Facebook

When we shared the second edition of the Community Standards Enforcement Report in November 2018, we updated our method for counting how we take actions on content. We did this so that the metrics better reflected what happens on Facebook when we take action on content for violating our Community Standards. For example, if we find that a post containing one photo violates our policies, we want our metric to reflect that we took action on one piece of content – not two separate actions for removing the photo and the post.
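
The counting rule in the example above can be sketched as follows. This is a hypothetical Python illustration: the `Action` record, its fields, and the `attachments_per_post` lookup are assumptions made for the sketch, and it only covers the single-photo case described above, not the full counting methodology.

```python
from typing import NamedTuple, Optional


class Action(NamedTuple):
    """Hypothetical log row for one enforcement action."""
    object_id: str
    parent_post_id: Optional[str]  # set when the object is a photo or video attached to a post


def count_content_actioned(actions: list[Action], attachments_per_post: dict[str, int]) -> int:
    """Count pieces of content actioned.

    A post and its only attachment are counted once, not twice.
    """
    actioned_ids = {action.object_id for action in actions}
    count = 0
    for action in set(actions):  # ignore exact duplicate log rows
        parent = action.parent_post_id
        is_sole_attachment = parent is not None and attachments_per_post.get(parent) == 1
        if is_sole_attachment and parent in actioned_ids:
            continue  # already counted through the post itself
        count += 1
    return count


# Made-up data, for illustration only.
actions = [
    Action("post1", None), Action("photo1", "post1"),     # post with one photo: counts once
    Action("post2", None),                                # text-only post
    Action("photoA", "post3"), Action("photoB", "post3"), # two photos from a multi-photo post
]
print(count_content_actioned(actions, {"post1": 1, "post3": 2}))  # 4, not 5
```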

However, in July 2019, we found that the systems logging and counting these actions did not correctly log the actions taken. This was largely because these systems need to count multiple actions that take place within a few milliseconds of one another without missing or overstating any individual action. Because our logging system for measurement purposes is distinct from our operations to enforce our policies, the issue with our accounting did not impact how we enforced our policies or how we informed people about those actions; it only impacted how we counted the actions we took. As soon as we discovered this issue, we worked to fix it, identify any incorrect metrics previously shared, and establish a more robust set of checks in our processes to ensure the accuracy of our accounting. In total, we found that this issue impacted the numbers we had previously shared for content actioned, proactive rate, appealed content, and restored content for Q3 2018, Q4 2018, and Q1 2019.

The fourth edition of the Community Standards Enforcement Report includes the correct metrics for the affected quarters and the table linked above provides the previously reported metrics and their corrections.

11/2019: Content actioned, proactive rate for spam on Facebook

At Facebook, different systems take action on different types of content to improve efficiency and reliability for the billions of actions happening every quarter. One of these systems, which acts mainly on content with links, did not log our actions for certain removed content if no one tried to view it within seven days of its creation, even though this content was removed from the platform. While we know this undercounts the true amount of actioned content containing external links, mainly affecting our spam metrics for content containing malicious links, we are not currently able to retrospectively size this undercounting. As such, the numbers currently reflected in the Community Standards Enforcement Report represent a minimum estimate of both content actioned and proactive rate for the impacted period. Updates about this issue will be posted here when available.