learn to combine modalities in multimodal deep learning