Sentence compression reduces the length of text by removing non-essential
content while preserving important facts and grammaticality. Unsupervised,
objective-driven methods for sentence compression can be used to create
customized models without ground-truth training data, while
allowing flexibility in the objective functions used for learning
and inference. Recent unsupervised sentence compression approaches use custom
objectives to guide discrete search; however, guided search is expensive at
inference time. In this work, we explore the use of reinforcement learning to
train effective sentence compression models that are also fast when generating
predictions. In particular, we cast the task as binary sequence labelling and
fine-tune a pre-trained transformer using a simple policy gradient approach.
Our approach outperforms other unsupervised models while also being more
efficient at inference time.
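
As a rough illustration of the idea of treating compression as binary sequence labelling trained with a policy gradient, the following toy sketch may help. It is not the model described here: the pre-trained transformer is replaced by one free logit per token, and the reward (`toy_reward`, with its `important` set and `target_len`) is a hypothetical stand-in for whatever objectives a practitioner would choose. All function names are assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_labels(logits, rng):
    """Sample a keep (1) / drop (0) label per token from independent Bernoulli policies."""
    probs = [sigmoid(z) for z in logits]
    labels = [1 if rng.random() < p else 0 for p in probs]
    return labels, probs


def toy_reward(tokens, labels, important, target_len):
    """Hypothetical reward: coverage of 'important' tokens minus a length-deviation penalty."""
    kept = [t for t, y in zip(tokens, labels) if y == 1]
    coverage = sum(1 for t in kept if t in important) / len(important)
    length_penalty = abs(len(kept) - target_len) / len(tokens)
    return coverage - length_penalty


def train(tokens, important, target_len, steps=2000, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline over per-token logits (toy policy)."""
    rng = random.Random(seed)
    logits = [0.0] * len(tokens)
    baseline = 0.0
    for _ in range(steps):
        labels, probs = sample_labels(logits, rng)
        reward = toy_reward(tokens, labels, important, target_len)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Gradient of log Bernoulli(y; sigmoid(z)) with respect to z is (y - sigmoid(z)).
        for i, (y, p) in enumerate(zip(labels, probs)):
            logits[i] += lr * advantage * (y - p)
    return logits


def compress(tokens, logits):
    """Greedy inference: one pass, keep every token whose keep-probability exceeds 0.5."""
    return [t for t, z in zip(tokens, logits) if z > 0.0]


if __name__ == "__main__":
    tokens = "a quick brown fox jumps over the lazy dog".split()
    important = {"fox", "jumps", "dog"}
    logits = train(tokens, important, target_len=3)
    print(compress(tokens, logits))
```

The point of the sketch is the contrast in cost: the reward is only evaluated during training, while `compress` needs a single pass over the token scores and no search at prediction time.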