Deep convolutional neural networks (DCNNs) are a popular approach to scene classification. However, as networks become deeper and wider, they also become harder to train. Some researchers have proposed randomly cropping images to reduce training difficulty, but random cropping weakens the relevance of the cropped image to its label. To address this, we propose a scene classification algorithm based on adaptive regional supervision, which consists of three parts: a heat map generation layer, an adaptively supervised cropping layer, and a classification layer. The algorithm generates a heat map for each image, adaptively crops the image based on the heat map, and finally classifies the cropped images, which improves the relevance of the cropped images to their labels. Experiments on the 15-Scene and MIT Indoor datasets show that our algorithm outperforms the original network architecture in both training efficiency and recognition performance, demonstrating its accuracy and robustness.
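To make the heat-map-guided cropping step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the heat map is a 2-D array aligned pixel-for-pixel with the image and that the crop is the fixed-size window with the largest total heat. The abstract specifies neither, and the function name `adaptive_crop` and its parameters are hypothetical.

```python
import numpy as np

def adaptive_crop(image, heat_map, crop_h, crop_w):
    """Return the crop_h x crop_w window of `image` with the largest total heat.

    Assumption (not from the paper): `heat_map` has shape (H, W) matching
    the first two axes of `image`, and the window is chosen by summed heat.
    """
    h, w = heat_map.shape
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds image size")
    # Integral image with a zero border so each window sum is four lookups.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = heat_map.cumsum(axis=0).cumsum(axis=1)
    # sums[y, x] = total heat in the window whose top-left corner is (y, x).
    sums = (integral[crop_h:, crop_w:] - integral[:-crop_h, crop_w:]
            - integral[crop_h:, :-crop_w] + integral[:-crop_h, :-crop_w])
    top, left = np.unravel_index(np.argmax(sums), sums.shape)
    top, left = int(top), int(left)
    return image[top:top + crop_h, left:left + crop_w], (top, left)
```

In contrast to random cropping, a window selected this way is biased toward the regions the heat map marks as salient, which is the intuition behind keeping the crop relevant to its label.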