Byzantine agreement algorithms typically assume implicit initial state consistency and synchronization among the correct nodes, and then operate in coordinated rounds of information exchange to reach agreement based on the input values. These implicit initial assumptions enable correct nodes to infer the progress of the algorithm at other nodes from their own local state. This paper considers a more severe fault model than permanent Byzantine failures, one in which the system can additionally be subject to severe transient failures that temporarily throw it outside its assumption boundaries. When the system eventually resumes behaving according to the presumed assumptions, it may be in an arbitrary state: any synchronization among the nodes might be lost, and each node may be in an arbitrary state. We present a self-stabilizing Byzantine agreement algorithm that reaches agreement among the correct nodes in optimal time, by using only the assumption of bounded messa...