Identifying urban functional zones is of great significance for understanding urban structure and supporting urban planning. The rapid growth and open accessibility of multisource big data, including remote sensing imagery and social sensing data, open a new route to the dynamic identification of urban functional zones. In this article, we propose an SOE (scene-object-economy) based learning framework that integrates scene features from remote sensing imagery, object features from building footprints, and economy features from POIs (points of interest). From these three perspectives, the framework mines the rich information contained in each urban zone for function identification. Convolutional neural networks are used to extract high-level scene information from remote sensing images of different resolutions. Object features, comprising a series of building indicators, are constructed by measuring the area, perimeter, floor number, and year of each building. Moreover, we extract socioeconomic characteristics from POIs, which reflect the different types of human activity in an urban zone. Finally, a random forest is used to identify functional zones from the combined SOE features. We apply the SOE-based framework to Shenzhen datasets and achieve an accuracy of 90.8% with remote sensing images of 0.3-m spatial resolution. The experimental results show that the predictive performance of the SOE-based framework is significantly better than that of traditional methods, and they also reveal the quantitative contribution of each SOE factor in determining the functionality of urban zones.
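As a rough illustration of how the three feature groups might be assembled for one zone, the following minimal Python sketch builds an SOE feature vector. It is not the implementation used in this article: the function names, the POI category list, the use of per-zone means as building indicators, and the use of category shares as economy features are all assumptions made for the example. The scene embedding is treated as a precomputed list of numbers standing in for CNN output.

```python
from collections import Counter
from statistics import mean

# Hypothetical POI categories; the actual taxonomy is not given in the text.
POI_CATEGORIES = ["residential", "commercial", "industrial", "public_service"]


def polygon_area_perimeter(coords):
    """Area (shoelace formula) and perimeter of a simple polygon.

    coords is a list of (x, y) vertices in a projected CRS (metres),
    without repeating the first vertex at the end.
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    return abs(twice_area) / 2.0, perimeter


def object_features(buildings):
    """Assumed indicators: mean area, perimeter, floors, and year per zone."""
    if not buildings:
        return [0.0, 0.0, 0.0, 0.0]
    shapes = [polygon_area_perimeter(b["footprint"]) for b in buildings]
    return [
        mean(area for area, _ in shapes),
        mean(perim for _, perim in shapes),
        mean(b["floors"] for b in buildings),
        mean(b["year"] for b in buildings),
    ]


def economy_features(poi_labels):
    """Assumed economy features: share of POIs in each known category."""
    counts = Counter(poi_labels)
    total = sum(counts[c] for c in POI_CATEGORIES)
    if total == 0:
        return [0.0] * len(POI_CATEGORIES)
    return [counts[c] / total for c in POI_CATEGORIES]


def soe_vector(scene_embedding, buildings, poi_labels):
    """Concatenate scene, object, and economy features for one zone."""
    return (
        list(scene_embedding)
        + object_features(buildings)
        + economy_features(poi_labels)
    )


# Toy zone: two buildings, four POIs, and a 2-dimensional stand-in embedding.
zone_buildings = [
    {"footprint": [(0, 0), (10, 0), (10, 20), (0, 20)], "floors": 6, "year": 2000},
    {"footprint": [(0, 0), (10, 0), (10, 10), (0, 10)], "floors": 2, "year": 2010},
]
zone_pois = ["commercial", "commercial", "residential", "commercial"]
features = soe_vector([0.1, 0.9], zone_buildings, zone_pois)
```

Vectors of this kind, one per zone, could then be stacked into a matrix and passed to a random forest classifier (for example, scikit-learn's `RandomForestClassifier`), whose feature importances give one way to compare the contribution of the scene, object, and economy groups.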