Over the past decades, Internet traffic has grown dramatically. Not only the number of transfers but also the size of the transferred data has risen. Traditional transfer protocols have not adapted to this development. Large-scale computational applications running on expensive parallel computers produce large amounts of data, which often have to be transferred to weaker machines at the clients’ premises. As the use of parallel computers is frequently billed by the minute, it is essential to minimize the transfer time once the computation has completed in order to keep costs down. Consequently, the economic focus lies on minimizing the time needed to move all data off the parallel computer, whereas the actual time of arrival at the client is less important, though still relevant. This paper describes the design and implementation of a new transfer protocol, the Fast Send Protocol (FSP), which stripes data across intermediate nodes in order to minimize sending time and to utilize the sender’s resources to a high degree.
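
The effect of striping on sending time can be illustrated with a simple back-of-the-envelope model. The sketch below is not the FSP implementation: the function names, the data size, and all rates are hypothetical, and it assumes that each path (the direct link to the client or a link to an intermediate node) sustains a fixed rate and that the aggregate rate is capped by the sender's uplink.

```python
def sending_time(size_mb, sender_uplink, path_rates):
    """Seconds until the sender has pushed out all data.

    Assumption: each path sustains a fixed rate (MB/s) and the
    aggregate rate cannot exceed the sender's uplink capacity.
    """
    rate = min(sender_uplink, sum(path_rates))
    return size_mb / rate


def stripe_sizes(size_mb, path_rates):
    """Split the data in proportion to each path's rate,
    so that all stripes finish sending at the same time."""
    total = sum(path_rates)
    return [size_mb * r / total for r in path_rates]


# Hypothetical scenario: 10 GB of results, a 100 MB/s sender uplink,
# and a client that can only receive at 10 MB/s.
direct = sending_time(10_000, 100, [10])

# Same client, plus three intermediate nodes accepting 30 MB/s each.
striped = sending_time(10_000, 100, [10, 30, 30, 30])
stripes = stripe_sizes(10_000, [10, 30, 30, 30])
```

Under these assumptions the sender is busy for 1000 seconds in the direct case and for 100 seconds in the striped case, because the intermediate nodes let it saturate its own uplink. The data still has to travel from the intermediate nodes to the client, so the time of arrival at the client is not reduced in this model.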