User-managed events are a popular feature on social networks. Take Facebook Events as an example: over 135 million events were created in 2015, and over 550 million people use events each month. In this work, we consider the heavy sparseness in both user and event feedback history caused by the short lifespans (transiency) of events and by user participation patterns in a production event system. We propose to solve the resulting cold-start problems by introducing a joint representation model that projects users and events into the same latent space. Our model, based on parallel Convolutional Neural Networks, captures semantic meaning in event text and also utilizes heterogeneous user knowledge available in the social network. By feeding the model output as user and event representations into a combiner prediction model, we show that our representation model improves prediction accuracy over existing techniques (+6% AUC lift). Our method provides a generic way to match heterogeneous information from different domains and applies to a wide range of applications in social networks.