Audiovisual classification of vocal outbursts in human conversation using Long-Short-Term Memory networks