Abstract: An optimized bilinear structure based on ResNet34, termed OBSR-Net, is proposed for more accurate and faster facial expression recognition. OBSR-Net adopts a bilinear network structure as its overall framework, with ResNet34 as the backbone network, to model local pairwise feature interactions in a translation-invariant manner and thereby extract more complete and effective features. Transfer learning is used to mitigate the limitations that small facial expression image datasets impose on deep learning. In addition, gradient centralization, a new general-purpose optimization technique, is applied during training. This technique operates directly on the gradients by centering each gradient vector to zero mean, and it can be regarded as a projected gradient descent method with a constrained loss function. Experiments on two public datasets, FER2013 and CK+, show that OBSR-Net achieves recognition accuracies of 77.65% and 98.82%, respectively. These results indicate that OBSR-Net is competitive with other advanced facial expression recognition methods.
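The two operations named in the abstract, pairwise feature interaction pooled over image locations and the zero-mean gradient step, can be illustrated with a minimal sketch in plain Python. This is not the OBSR-Net implementation. The function names `bilinear_pool` and `center_gradient` are hypothetical, features are represented as nested lists rather than ResNet34 outputs, and the signed square root and L2 normalization after pooling are assumptions borrowed from common bilinear-pooling practice that the abstract does not state.

```python
import math


def center_gradient(grad):
    """Center each row of a 2-D gradient to zero mean.

    Assumption: each row holds the gradient of one output unit's
    weight vector, so the mean is taken over that row.
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


def bilinear_pool(feat_a, feat_b):
    """Sum the outer products of two feature streams over all locations.

    feat_a: list of L vectors of length Ca (one per spatial location)
    feat_b: list of L vectors of length Cb (same locations)
    Returns a flattened vector of length Ca * Cb.
    """
    ca, cb = len(feat_a[0]), len(feat_b[0])
    pooled = [0.0] * (ca * cb)
    for a, b in zip(feat_a, feat_b):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += a[i] * b[j]
    # Assumed post-processing: signed square root, then L2 normalization.
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed)) or 1.0
    return [v / norm for v in signed]


if __name__ == "__main__":
    grad = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
    print(center_gradient(grad))

    # Three locations, two channels in stream A and three in stream B.
    fa = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
    fb = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
    print(bilinear_pool(fa, fb))
```

Because the outer products are summed over locations, reordering the locations leaves the pooled descriptor unchanged, which is one way to read the translation invariance mentioned above. In a real training loop the centering step would be applied to each weight gradient before the optimizer update.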