Buffered crossbar switches are a special type of combined input-output queued switch in which each crosspoint of the crossbar has small on-chip buffers. The introduction of crosspoint buffers greatly simplifies the scheduling process of buffered crossbar switches, and furthermore enables buffered crossbar switches with a speedup of two to easily provide port-based performance guarantees. However, recent research results have indicated that, in order to provide flow-based performance guarantees, buffered crossbar switches must either increase the speedup of the crossbar to three or greatly increase the total number of crosspoint buffers, both of which add significant hardware complexity. In this paper, we present scheduling algorithms for buffered crossbar switches that achieve flow-based performance guarantees with a speedup of two and with only one or two buffers at each crosspoint. When there is no crosspoint blocking in a specific time slot, only the simple and distributed input sched...