In this paper, we propose a new internally buffered crossbar (IBC) switching architecture in which the distributed input and output schedulers are embedded inside the crossbar fabric chip. In contrast to previous designs, where these schedulers are spread across the input and output line cards, our design gives the schedulers cheap and fast access to the internal buffers, optimizes the flow control mechanism, and makes the IBC more scalable. To show the feasibility of our proposal, we used the Xilinx Virtex-4FX platform to implement a reconfigurable-hardware-based IBC switch with the maximum port count that we could fit on a single chip. The experiments suggest that a 24