The discrete multitone (DMT) modulation technique has been widely applied to data transmission over fading channels of twisted pairs. It has been shown that a DMT system with ideal filters can achieve a rate within 8 to 9 dB of the channel capacity of ADSL. The DFT based DMT system has been proposed as a practical DMT implementation, but its optimality has never been established. In this paper we show that DFT based DMT systems are asymptotically optimal, although they are not optimal for a finite number of channels. As the number of channels grows, the DFT based DMT system and the DMT system with ideal filters achieve the same bound. However, for a modest number of channels, the optimal transceiver can provide substantial gain over the DFT based system, as will be demonstrated by examples.