A major challenge in biometrics is performing the test on the client side, where hardware resources are often limited. Deep learning approaches pose a particular challenge: while such architectures dominate face recognition in terms of accuracy, they require elaborate, multi-stage computations. Recently, there has been some work on compressing networks to reduce run time and network size. However, it is not clear that these compression methods would work for deep face networks, which are generally less redundant than object recognition networks, i.e., they are already relatively lean. We propose two novel compression methods: one based on eliminating channels with low activity, and the other on coupling pruning with the repeated use of already computed elements. Pruning entire channels is appealing, since it leads to direct savings in run time in almost every reasonable architecture.
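
To make the idea of channel pruning concrete, the following is a minimal sketch and not the method proposed in this work. It assumes that channel activity is measured as the mean absolute activation over a small calibration set, and that a fixed fraction of channels is kept. The function names, the `keep_ratio` parameter, and the toy data are all hypothetical. The sketch shows why removing a whole channel saves run time directly: the filter that produces the channel disappears from one layer, and the matching input weights disappear from the next.

```python
# Illustrative sketch only: activity-based channel pruning on plain Python lists.
# The activity measure (mean absolute activation) and keep_ratio are assumptions,
# not the criterion used in the paper.

def channel_activity(activations):
    """activations[n][c] is the pooled response of channel c on sample n."""
    num_samples = len(activations)
    num_channels = len(activations[0])
    return [
        sum(abs(activations[n][c]) for n in range(num_samples)) / num_samples
        for c in range(num_channels)
    ]

def select_channels(activity, keep_ratio):
    """Return the sorted indices of the most active channels."""
    num_keep = max(1, round(keep_ratio * len(activity)))
    ranked = sorted(range(len(activity)), key=lambda c: activity[c], reverse=True)
    return sorted(ranked[:num_keep])

def prune_layer_pair(w_cur, w_next, keep):
    """Drop pruned channels from two consecutive layers.

    w_cur[c] holds the filter that produces channel c.
    w_next[o][c] is the weight from input channel c to output o of the next layer.
    """
    pruned_cur = [w_cur[c] for c in keep]
    pruned_next = [[row[c] for c in keep] for row in w_next]
    return pruned_cur, pruned_next

# Toy calibration data: 3 samples, 4 channels (hypothetical values).
activations = [
    [0.9, 0.01, 0.5, 0.0],
    [1.1, 0.02, 0.4, 0.0],
    [1.0, 0.00, 0.6, 0.01],
]
w_cur = [[1, 2], [3, 4], [5, 6], [7, 8]]                # 4 output channels
w_next = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]   # 4 input channels

activity = channel_activity(activations)
keep = select_channels(activity, keep_ratio=0.5)
pruned_cur, pruned_next = prune_layer_pair(w_cur, w_next, keep)
```

In this toy setting the two weakly responding channels are removed, so both layers shrink without any sparse-matrix support. In practice a pruned network would typically be fine-tuned afterwards to recover accuracy.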