The hinge between input and output: understanding the multimodal input fusion results in an agent-based multimodal presentation