We present a multimodal interaction framework that allows a human operator to interact with co-located drones during search and rescue missions. In contrast to typical human-multidrone interaction scenarios, the operator here is not fully dedicated to controlling the robots but is directly involved in the search and rescue tasks, and can therefore provide only quick, though high-value, instructions to the robots. This scenario requires a framework that supports intuitive multimodal communication along with effective and natural mixed-initiative interaction between the human and the robots. In this work, we describe the domain and the multimodal interaction framework designed for it.