Late CMOS scaling reduces device reliability, and existing work has extensively studied the permanent soft error rate (SER) of FPGA configuration memory. In this paper, we show that continued CMOS scaling dramatically increases the significance of FPGA chip-level transient soft errors in circuit elements other than configuration memory, so that transient SER can no longer be ignored. We then develop an efficient yet accurate transient SER evaluation method, called the trace-based methodology, that accounts for logic, electrical, and latch-window masking. By collecting traces of logic probability and sensitivity and reusing them across different device settings, we perform concurrent device and architecture optimization over hundreds of device and architecture combinations. Compared to the commonly used FPGA architecture and device settings, concurrent device and architecture optimization can reduce the transient SER by 2.8X and reduce the product of en