The AI capturing feature for Huawei’s Mate 10 was based on powerful computer algorithms, and inspired by the way humans perceive visual data. Credit: GRAFVISHENKA/GETTY
Smartphones can use artificial intelligence (AI) to automatically zoom and capture distant moving objects, or to recognize separate elements in a scene, ensuring the best possible photos. The powerful photo-shooting feature of Huawei’s Mate 10, launched in 2017, is based on an object detection technology developed by computer scientists at Nankai University.
AI-powered smartphone cameras are just one example of researchers from Nankai’s College of Computer Science using advanced computer science technologies to solve industrial problems.
The college was established in 2018, but Nankai’s strong tradition in computer science studies started in 1958. The college has built on that experience to address national strategic issues by developing technologies in image processing and analysis, big data analysis and knowledge management, as well as distributed computing and 5G network architectures.
From understanding visual attention to improved image processing
The AI capturing feature for Huawei’s Mate 10 was based on powerful computer vision algorithms developed at the college’s Media Computing Lab led by a young professor, Cheng Mingming. “We were inspired by how humans perceive visual data, which is one of the most important foundations for our intelligent activities,” said Cheng. “And we focused on salient object detection, a fundamental computer vision problem.”
While even a child can effortlessly pinpoint the most important object in an image, teaching machines to understand visual images is not easy. Most computer vision algorithms require large amounts of precise image annotations to train machines. By efficiently analyzing the global structure of images, Cheng’s team broke the performance bottleneck, enabling machines to automatically extract the most noticeable object in an image.
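As a rough illustration of what scoring pixels against the global structure of an image can mean, the sketch below rates each pixel by how strongly its intensity contrasts with the rest of the image. It is a simplified, grayscale stand-in written for this article, not the team's published method; the function name and the list-of-rows image format are assumptions.

```python
from collections import Counter


def global_contrast_saliency(gray):
    """Score each pixel of a 2-D grayscale image (a list of rows of ints).

    A pixel's score is the frequency-weighted distance between its
    intensity and every other intensity in the image, rescaled to [0, 1].
    """
    counts = Counter(value for row in gray for value in row)
    total = sum(counts.values())
    # Contrast of each distinct intensity against the whole image.
    level_score = {
        level: sum(abs(level - other) * n / total for other, n in counts.items())
        for level in counts
    }
    low, high = min(level_score.values()), max(level_score.values())
    span = high - low
    # A uniform image has no contrast, so every pixel scores 0.0.
    return [
        [(level_score[v] - low) / span if span else 0.0 for v in row]
        for row in gray
    ]
```

A small bright patch on a large dark background receives the highest score, because its intensity is rare and far from the dominant one. No annotated training images are involved in this toy version.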
The salient object detection technique developed by Cheng’s team has led to improved technologies for object detection, image retrieval, weakly supervised learning, knowledge discovery, and image manipulation. Their work, published in IEEE Transactions on Pattern Analysis and Machine Intelligence, has formed the basis for experiments in computer vision and computer graphics, and received 2,700-plus citations according to Google Scholar data. As well as use in smartphones, Cheng’s computer vision technologies are used by Tencent’s QQ Space, a social media platform, for image processing. Applications in healthcare and education are also being developed.
At Nankai’s Intelligent Computing System Lab, professors Li Tao and Wang Kai are creating AI-powered image analysis techniques for disease diagnosis. They have developed deep-learning techniques to analyze retinal fundus images, which show the blood vessels, the optic disc, the macula and any lesions. Their technique has enabled accurate identification and segmentation of the optic disc and cup from fundus images, helping with the diagnosis of glaucoma. The automated analysis of fundus images will also enable grading of diabetic retinal lesions and diagnosis of age-related macular degeneration.
Nankai’s Computer Vision Lab, led by computer science professor Yang Jufeng, has constructed a diagnosis system for processing clinical skin disease images. Based on accepted dermatological criteria, the team has designed medical image representations that effectively capture the manifestation of skin lesions and improve disease prediction.
Vision computing technologies developed by Yang’s team also include algorithms for enhancing images. One challenge in image enhancement is highlighting the impressive characteristics of an image while simultaneously rendering its important details. As different users and situations may require different processing styles, Nankai computer vision scientists proposed a multimodal image enhancement framework that encodes impressive characteristics of visually appealing images into a meta-space, resulting in multiple image candidates with diverse impressive characteristics.
To do this, they disentangled the style and content codes of images using an encoder-decoder strategy. The style codes were then mapped to the characteristic meta-space, each base of which represents a specific aesthetic characteristic extracted from a set of images. In testing, one can randomly interpolate a characteristic from the meta-space and create an enhanced result. Experiments show that the framework performs favourably against state-of-the-art methods in terms of visual realism, diversity, and aesthetic measures.
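The interpolation step can be pictured with a minimal sketch. It assumes that a style code is a plain vector and that a new characteristic is a random convex combination of the meta-space bases; the source does not say how the team samples, and the encoder and decoder are not shown. The function name `sample_style_code` is hypothetical.

```python
import random


def sample_style_code(bases, rng):
    """Blend the basis vectors of a (hypothetical) characteristic meta-space.

    `bases` is a list of equal-length vectors, one per aesthetic
    characteristic. Random non-negative weights that sum to 1 give a
    convex combination, i.e. a point interpolated between the bases.
    """
    raw = [rng.random() for _ in bases]
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(bases[0])
    code = [sum(w * b[i] for w, b in zip(weights, bases)) for i in range(dim)]
    return weights, code
```

In a full system the sampled code would be passed to the decoder together with the content code of the input image; drawing several codes would yield several enhanced candidates.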
Improving distributed computing technologies
Search engines are important internet entry points. Improving their key algorithms for data organization and processing is essential for increased efficiency. The Parallel and Distributed Software Lab (PDSL), led by professors Wang Gang and Liu Xiaoguang, has cooperated closely with China’s search engine giant, Baidu, to develop solutions to complex challenges in search engines. Nankai researchers from PDSL have developed an efficient cache algorithm for large inverted indexes. The algorithm was applied in the online search engine systems of Baidu, leading to significantly improved performance. It has shortened the wait for search results, improving the online experience.
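The article does not describe the cache algorithm itself. To show the general idea of caching inverted-index entries, the sketch below uses a textbook least-recently-used (LRU) policy as a stand-in; the class name `PostingListCache` and the `load_from_index` callback are hypothetical.

```python
from collections import OrderedDict


class PostingListCache:
    """Least-recently-used cache mapping a search term to its posting list."""

    def __init__(self, capacity, load_from_index):
        self.capacity = capacity
        self.load_from_index = load_from_index  # slow path, e.g. a disk read
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, term):
        if term in self.entries:
            self.entries.move_to_end(term)  # mark as most recently used
            self.hits += 1
            return self.entries[term]
        self.misses += 1
        postings = self.load_from_index(term)
        self.entries[term] = postings
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return postings
```

Frequently searched terms stay in memory and are answered without touching the full index, which is where the shorter wait for results would come from.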
The growth of the internet has led to increasingly large data centres with high operating costs. Improving their resource utilization and reducing operating costs is essential. PDSL members have worked with China’s big IT companies to explore how to balance loads and schedule traffic. They have proposed an online load balancing strategy, which, while ensuring the quality of online services, has obvious advantages over traditional load balancing algorithms. The team’s results are now used in large data centres, saving more than 10 million RMB in operational costs.
Another contribution by PDSL members is the development of an intelligent fault prediction system for data centres. Predicting failures and addressing them proactively can fundamentally improve reliability and reduce costs. Using SMART (Self-Monitoring, Analysis and Reporting Technology) monitoring data and machine learning methods, the system can predict the most frequent hard disk failures in large data centres with an accuracy rate higher than 95% and a false alarm rate of 0.01% or less. Several major IT companies, including Baidu, have adopted this technology, which can also be extended to predicting similar hardware and software faults.
Growing mobile multimedia traffic, and the increasing demand from mobile users for improved multimedia service, have challenged current wireless network bandwidth and services. To meet the requirements for low latency, low overhead, and high throughput, the Computer Networks and Information Security team, led by Nankai professor Xu Jingdong, has designed a series of content caching and forwarding schemes for fifth-generation (5G) networks.
Edge-caching, a concept introduced in recent years, is expected to alleviate the burden of increasing data transmission by caching and forwarding content at the edge of networks. In an edge-caching service framework for 5G networks, popular multimedia content is placed in central units (CUs) and distributed units (DUs), and users’ requests for specific content are redirected from remote content servers to closer CUs and DUs to reduce service response latency and network service overhead.
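To make the two quoted figures concrete, the sketch below computes a detection rate and a false alarm rate from per-drive outcomes. It assumes that the accuracy figure refers to the share of failing drives that were flagged, and that the false alarm rate is the share of healthy drives wrongly flagged; the source does not define either term, and the function name is hypothetical.

```python
def prediction_rates(labels, predictions):
    """Return (detection_rate, false_alarm_rate) for a batch of drives.

    `labels` and `predictions` are parallel sequences of booleans, one
    entry per drive: True means the drive failed / was flagged as failing.
    """
    failed = sum(1 for y in labels if y)
    healthy = len(labels) - failed
    caught = sum(1 for y, p in zip(labels, predictions) if y and p)
    false_alarms = sum(1 for y, p in zip(labels, predictions) if not y and p)
    detection_rate = caught / failed if failed else 0.0
    false_alarm_rate = false_alarms / healthy if healthy else 0.0
    return detection_rate, false_alarm_rate
```

Under these definitions, flagging 19 of 20 failing drives gives a 95% detection rate, and one wrong flag among 10,000 healthy drives gives a 0.01% false alarm rate.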
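The redirection can be sketched as a tiered lookup. The example assumes that a DU sits closer to the user than a CU and that each tier is a simple mapping from a content identifier to the content; both the function name and the data layout are hypothetical, and the team's placement and request distribution strategies are not modelled.

```python
def serve_request(content_id, du_cache, cu_cache, origin):
    """Return (content, tier) from the nearest tier that holds the content.

    Lookup order is DU, then CU, then the remote origin server.
    """
    for tier, cache in (("DU", du_cache), ("CU", cu_cache)):
        if content_id in cache:
            return cache[content_id], tier
    return origin[content_id], "origin"
```

Only requests that miss both edge tiers travel to the remote server, which is how the framework is expected to cut response latency and network overhead.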
With the increasing volume and types of popular multimedia content, reducing the cost of service without jeopardizing service quality has become a hot research topic.
Focusing on resource management in edge-caching for 5G networks, Xu’s team has designed frameworks for CU-DU hierarchical edge-caching, as well as self-organizing edge-caching for mobile devices. They have also worked on content deployment and request distribution strategies. Their results have the potential to meet the needs of the multimedia service industry for content caching, which will lead to important economic and social benefits.
Big data and knowledge management
Thanks to the Internet of Things and cloud computing technologies, enormous amounts of data are produced every day. Big data analysis and knowledge management have become essential research topics.
A research team led by Yuan Xiaojie and Yang Zhenglu, professors from Nankai’s College of Computer Science, has focused on multi-source heterogeneous big data analysis and integration, along with knowledge extraction and management. They have come up with a variety of world-leading entity linking algorithms for multi-source heterogeneous data analysis, knowledge extraction and completion frameworks, and text semantic understanding models. Their results have been cited and followed up by many renowned scholars worldwide, including Association for Computing Machinery (ACM) members.
Another focus of the team is to analyze and process massive data streams in real time to provide intelligent decision support. They conduct scientific data modelling for common problems in real life, develop new algorithms, and solve key problems such as real-time tracking, prediction of hot topics and typical events, trend matching, and pattern matching of complex events. Their research has found applications in mobile communications, intelligent transportation, and social networking.
Seeing the growing demand for analyzing large volumes of complicated medical data to identify valuable information, Yuan, Yang and their colleagues are investigating the junction of big data management and analysis technology to efficiently and accurately process medical big data and assist decision-making.
They have made breakthroughs in precision medicine and clinical data analysis by using deep learning to analyze large-scale DNA methylation data to achieve non-invasive early lung cancer screening. They also use machine learning methods to improve clinical data quality, providing guidance for patient admission analysis and for studying chronic disease development.
The team has also developed a series of models and algorithms for graph data management and in-depth analysis, based on data management theory, machine learning, and deep learning techniques. These models address problems such as internet-specific group identification and association analysis, social network information dissemination and prediction, network traffic dynamics perception, and situation prediction.