Discussion threads form a central part of the experience on many Web sites, including social networking sites such as Facebook and Google Plus and knowledge creation sites such as Wikipedia. To help users manage the challenge of allocating their attention among the discussions that are relevant to them, there has been a growing need for the algorithmic curation of on-line conversations — the development of automated methods to select a subset of discussions to present to a user.
Here we consider two key sub-problems inherent in conversational curation: length prediction — predicting the number of comments a discussion thread will receive — and the novel task of re-entry prediction — predicting whether a user who has participated in a thread will later contribute another comment to it. The first of these sub-problems arises in estimating how interesting a thread is, in the sense of generating a lot of conversation; the second can help determine whether users should be kept notified of the progress of a thread to which they have already contributed. We develop and evaluate a range of approaches for these tasks, based on an analysis of the network structure and arrival pattern among the participants, as well as a novel dichotomy in the structure of long threads. We find that for both tasks, learning-based approaches using these sources of information.