Learning enhanced features and inferring twice for fine-grained image classification