Portrait matting is an important task in image processing. To address the rough portrait extraction caused by the diverse scales of human figures in existing image data, this study proposes a semantic-aware automatic portrait matting network with a dual-pyramid encoder. The dual-pyramid encoder consists of an input pyramid and a feature pyramid. In the input pyramid, the input image is proportionally downsampled and fed into the network to preserve the original image details. The feature pyramid combines banded convolution groups with five levels of encoding blocks to fully capture image features at different levels. In the dual-branch decoder, a field expansion module is designed in the global segmentation decoding branch to enlarge the network’s receptive field, further enhancing its ability to capture global contextual information. In the local detail branch, a detail-aware module is proposed to fuse encoded features with the decoder output, guiding the network to focus on portrait contours. A comparative analysis evaluates six automatic portrait matting methods on three datasets. The results demonstrate that the proposed method achieves superior matting performance compared to the other methods, validating its effectiveness in enhancing the precision and robustness of portrait extraction in complex image data.
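The input-pyramid idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the downsampling factor of 2 and the use of average pooling are assumptions; only the count of five scales follows the five encoding levels mentioned in the abstract.

```python
import numpy as np

def downsample(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Average-pool an H x W x C image by an integer factor (assumed 2)."""
    h, w, c = img.shape
    h2, w2 = h // factor, w // factor
    # Crop to a multiple of the factor, then average each factor x factor block.
    return img[: h2 * factor, : w2 * factor].reshape(
        h2, factor, w2, factor, c
    ).mean(axis=(1, 3))

def input_pyramid(img: np.ndarray, levels: int = 5) -> list:
    """Return the image at `levels` proportional scales: 1, 1/2, 1/4, ..."""
    scales = [img]
    for _ in range(levels - 1):
        scales.append(downsample(scales[-1]))
    return scales

# Each scale would be fed to the matching encoder level so that
# fine detail from the full-resolution input is not lost.
pyramid = input_pyramid(np.zeros((256, 256, 3)), levels=5)
print([p.shape[:2] for p in pyramid])
# -> [(256, 256), (128, 128), (64, 64), (32, 32), (16, 16)]
```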