Abstract
Vision is the most important way for us to obtain external information. Understanding the information-processing mechanisms of the biological visual system can help us better understand how the brain works.
The human visual system has a remarkable capacity for processing visual information, and its intelligence outperforms the best current computer vision systems. Building brain-inspired methods to improve computer vision applications has therefore attracted serious attention.
Although computer vision research has made great progress in recent years, there is still a wide gap between computer vision systems and human vision, except on some specific visual tasks such as object recognition on specific image datasets [1].
This paper reviews the current state of research worldwide on brain-inspired computer vision.
Keywords: visual perception, brain-inspired computer vision
Introduction
Vision is the most important way for us to obtain external information. Understanding the information-processing mechanisms of the biological visual system can help us better understand how the brain works.
On the other hand, a great deal of information acquisition and processing in artificial intelligence also depends on visual signals such as video and images. The ultimate goal of computer vision is to build a visual information processing system as intelligent as the biological visual system, one that can help artificial intelligence systems acquire and analyze visual signals. With the development of Internet technology, the amount of vision-based data is also growing sharply. It includes photographs taken with mobile phones, the various kinds of video captured by surveillance equipment, and so on. Encoding and storing these massive data effectively, as well as mining useful information from them, is a major challenge that computer science now faces. It is hoped that computers can efficiently sense and understand the world and handle complex visual information as human beings do, which places higher demands on all kinds of visual information processing algorithms. Building a computer with visual perception is extremely difficult, because it requires reconstructing real three-dimensional physical scenes from two-dimensional images. People, however, accomplish these tasks with ease. Although computer vision research has made great progress in recent years, there is still a wide gap between computer vision systems and human vision, except on some specific visual tasks such as object recognition on specific image datasets [1].
Currently, vision research mainly comprises two fields: visual neuroscience, which studies the information-processing mechanisms of the visual system, and computer vision, which aims to develop visual information processing methods. Neuroscientists concentrate more on the neural mechanisms of visual information processing. About 70% of the information that the brain receives comes from the visual system, and nearly 40% of the neurons in the brain are involved in processing visual information. At the same time, biological visual information processing is not carried out in isolation; it is closely related to higher cognitive processes such as learning and memory. Vision is therefore both an important entry point for brain science research and a part of it. In recent decades, neuroscientists have systematically studied vision from different angles and at different levels, such as the neuron level, the neural network level, cognition, and behavior, and have accumulated rich research data. To integrate experimental data from these different levels effectively, computational neuroscientists build computational models of the basic functions at each level and use them to explore the computational principles of visual information processing.
Visual perception refers to the process by which the visual system perceives and understands a scene by processing information from the external environment. The visual system of intelligent beings is a very complex and efficient information processing system. It can carry out a series of tasks quickly and accurately, such as image acquisition, detection of objects in images, and perception of the distance, relationships, and movement of objects.
The human visual system has a remarkable capacity for processing visual information, and its intelligence outperforms the best current computer vision systems. Building brain-inspired methods to improve computer vision applications has therefore attracted serious attention.
This paper reviews the current state of research worldwide on brain-inspired computer vision.
The status quo of brain-inspired computer vision
The human brain is the most efficient information processing system known to us, yet we know little about its information-processing mechanisms. At present, scientific research on the brain is at the forefront of natural science and is also one of the greatest challenges humanity faces in understanding itself. Exploring the mechanisms of the brain will therefore let humans know themselves more profoundly. At the same time, simulating the mechanisms of the brain and developing intelligent information processing systems will boost not only the development of the information industry but also a new technological revolution. Europe and the United States have put forward strategic brain research programs whose key content includes simulating the computing patterns of the brain and developing brain-inspired technology. In recent years, a Chinese brain project has also been put forward. Brain-inspired computing has become a hot research topic in the field of artificial intelligence.
At present, the efficiency of intelligent information processing in biological visual systems (especially those of higher animals) is far superior to that of any existing computer vision system. As a result, researchers have begun to introduce mechanisms of biological visual information processing in order to develop new computer vision algorithms. In this paper, brain-inspired computer vision means developing corresponding computer algorithms, by learning from biological visual information processing mechanisms, to perform image processing or visual tasks such as image enhancement, image segmentation, and object detection. Inspiration from biological vision often provides simple and efficient solutions to many computer vision problems. Experimental study of the mechanisms of early (front-end) vision has yielded rich results, and scientists now have a fairly thorough understanding of its information-processing mechanisms. However, the mechanisms of higher-level visual information processing still remain a mystery. Research on brain-inspired computer vision therefore started from low-level visual tasks and the front end of the visual system.
In the 1970s, Marr put forward the computational theory of vision [4], based on the achievements of research on visual mechanisms in psychophysics, neurophysiology, and clinical neurology. Marr systematically studied the problem of visual information processing from two aspects: visual representation and visual processing. He described visual information processing at three levels: computational theory, algorithm, and hardware implementation. Marr believed that the human visual system combines the various visual representations of an image organically to form the perception of objects and the understanding of the scene. The theory holds that vision is formed in three stages: the establishment of the primal sketch, the establishment of the 2.5-dimensional sketch, and the formation of the three-dimensional object representation. Although the theory is not perfect (it was established only theoretically and ignores the role of feedback mechanisms), it was the first to establish a framework for deriving the structure of the outside world from visual image information, and it greatly promoted the development of computer vision research.
A typical example of brain-inspired computer vision is Retinex theory [5, 6], put forward in the 1960s by Land, who modeled the mechanisms of the retina and visual cortex. The key idea is that the color and brightness people perceive at a given position depend not only on the absolute light coming from that position but also on the color and brightness of its surroundings. Building on Land's Retinex theory, Jobson and colleagues developed the single-scale Retinex algorithm [7] and then the multi-scale Retinex algorithm [8], and these algorithms have been successfully applied to image processing tasks such as image enhancement and color constancy. In addition, the idea of adapting to bright or dark environments by means of cone cells and rod cells is also widely used in image enhancement and high dynamic range image processing [9–11].
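The surround-dependent idea behind single-scale Retinex can be sketched in a few lines. The sketch below assumes the commonly stated form, in which the output at each pixel is the log of the pixel minus the log of a Gaussian-blurred surround; it is not taken from [7] or [8]. The function names, the default `sigma`, the clamped border handling, and the small `eps` constant are all illustrative assumptions.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Filter a list with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def blur_2d(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def single_scale_retinex(image, sigma=2.0, eps=1e-6):
    """log(pixel) - log(blurred surround); eps (an assumption) avoids log(0)."""
    surround = blur_2d(image, sigma)
    return [[math.log(p + eps) - math.log(s + eps)
             for p, s in zip(prow, srow)]
            for prow, srow in zip(image, surround)]

# A bright spot on a dark background, as a list of rows of intensities.
image = [[10.0] * 5 for _ in range(5)]
image[2][2] = 100.0
result = single_scale_retinex(image)
```

Because the output is a log ratio of each pixel to its surround, multiplying the whole image by a constant (a uniformly brighter illuminant) leaves the result essentially unchanged, which is the link to color constancy mentioned above.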
In addition, in early visual processing, biological receptive field models have been widely used in various image processing tasks. For example, receptive field models based on the mechanisms of retinal and LGN neurons can be used for color correction, for enhancing the visibility of hazy images [12, 13], and for other visual tasks. Computational models based on the receptive field properties of the primary visual cortex have been successfully applied to image texture analysis and contour detection [14, 15]. Recently, Wei and colleagues established a retinal neural circuit model that is used to describe image features [16, 17]. The image features encoded by this model were found to improve the performance of higher-level visual tasks (such as image segmentation and object recognition). By simulating the mechanism of color-opponent receptive fields, Zhang and his team established a color descriptor that also significantly improves the performance of contour detection and object recognition systems [18]. In addition, based on the rattlesnake's receptive field mechanism for visible-light and infrared signals, the receptive field model established by Waxman's team can be used to fuse visible and infrared images in low-light environments [19].
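A common textbook way to describe a center-surround receptive field is the difference of Gaussians (DoG): a narrow excitatory center minus a broad inhibitory surround. The one-dimensional sketch below illustrates that general idea only; it is not the specific model of any of the works cited above, and the function names, the two sigma values, and the kernel radius are assumptions chosen for illustration.

```python
import math

def normalized_gaussian(sigma, radius):
    """1-D Gaussian sampled at integer offsets, scaled to sum to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def dog_kernel(sigma_center=1.0, sigma_surround=3.0, radius=9):
    """Center minus surround; the weights sum to (almost exactly) zero."""
    center = normalized_gaussian(sigma_center, radius)
    surround = normalized_gaussian(sigma_surround, radius)
    return [c - s for c, s in zip(center, surround)]

def receptive_field_response(signal, kernel):
    """Apply the kernel at every position, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(signal)
    return [sum(w * signal[min(max(i + k - radius, 0), n - 1)]
                for k, w in enumerate(kernel))
            for i in range(n)]

# A step edge: dark on the left, bright on the right.
step = [0.0] * 10 + [1.0] * 10
response = receptive_field_response(step, dog_kernel())
```

Because the kernel weights sum to zero, uniform regions produce almost no response, while the response is concentrated around the step. This is one reason such models are natural building blocks for contour detection and local contrast enhancement.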
Another successful example is the selective attention model. Visual selective attention is an important means of encoding visual information efficiently [2, 20]. Based on the feature integration theory put forward by Treisman, Itti's visual selective attention model has been widely used in the field of image processing [21, 24]. For instance, in the JPEG2000 image compression method, a selective attention model can be used to compress an image while retaining its important regions as much as possible [21]. Selective attention models have also promoted the development of higher-level computer vision algorithms, such as object recognition [25]. Recent research applied a saliency model to analyzing the behavior of people with autism [26], which helps in understanding the mechanisms of autism spectrum disorder and may support its treatment. A saliency model has also been used to analyze driver behavior, which has important reference value for the design of self-driving systems [27]. At the same time, accounts of visual perception based on Gestalt principles or on topological perception theory provide new insight for the development of intelligent object perception algorithms [3].
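The feature-integration idea, in which separate feature maps are computed, normalized, and combined into one saliency map, can be illustrated with a deliberately crude sketch. This is not Itti's model: it uses only two features (intensity and a red-green difference), replaces multi-scale center-surround differences with deviation from the global mean, and normalizes by simple peak scaling. All names and choices below are assumptions made for illustration.

```python
def contrast_map(feature):
    """Absolute deviation of each value from the global mean (a crude surround)."""
    flat = [v for row in feature for v in row]
    mean = sum(flat) / len(flat)
    return [[abs(v - mean) for v in row] for row in feature]

def normalize(fmap):
    """Scale a map so its peak is 1; an all-zero map stays all zero."""
    peak = max(max(row) for row in fmap)
    if peak == 0:
        return [[0.0 for _ in row] for row in fmap]
    return [[v / peak for v in row] for row in fmap]

def saliency(rgb_image):
    """Average the normalized contrast maps of two simple features."""
    height, width = len(rgb_image), len(rgb_image[0])
    intensity = [[(r + g + b) / 3.0 for r, g, b in row] for row in rgb_image]
    red_green = [[float(r - g) for r, g, b in row] for row in rgb_image]
    maps = [normalize(contrast_map(f)) for f in (intensity, red_green)]
    return [[sum(m[y][x] for m in maps) / len(maps) for x in range(width)]
            for y in range(height)]

# A gray image with a single red pixel; pixels are (r, g, b) tuples.
image = [[(100, 100, 100) for _ in range(4)] for _ in range(4)]
image[1][2] = (200, 40, 40)
smap = saliency(image)
```

In this toy example the red pixel differs from the rest of the image in both features, so it receives the highest saliency value, while a completely uniform image yields a map of zeros.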
For higher-level vision tasks, Poggio and colleagues put forward the HMAX model, based on the hierarchical information processing of the visual system, which achieves good performance in object recognition in complex scenes [28, 29]. The deep learning algorithms that have been successfully applied in computer vision also use, to some extent, the hierarchical processing of the visual system [30–34]. Among them, the weight-sharing mechanism of convolutional neural networks is similar to that of visual receptive fields. The convolution and pooling operations in convolutional neural networks are directly inspired by the way simple cells and complex cells in the visual system process information. Moreover, the overall structure of a convolutional neural network resembles the processing hierarchy of the cortical visual pathway, LGN-V1-V2-V4-IT [34].
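The analogy between simple cells and convolution, and between complex cells and pooling, can be made concrete with a minimal sketch. One small kernel is applied at every image position (weight sharing, loosely like simple cells with the same tuning at different locations), and max pooling then keeps the strongest response in each neighborhood (loosely like a complex cell that tolerates small shifts). The kernel values, image sizes, and function names are illustrative assumptions, and the "convolution" is the cross-correlation form usually used in deep learning.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def relu(fmap):
    """Keep positive responses only."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size windows."""
    return [[max(fmap[y + i][x + j] for i in range(size) for j in range(size))
             for x in range(0, len(fmap[0]) - size + 1, size)]
            for y in range(0, len(fmap) - size + 1, size)]

# A kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1.0, 1.0],
               [-1.0, 1.0]]

# Two 5x5 images with the same vertical edge, shifted by one pixel.
edge_a = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_b = [[0.0, 1.0, 1.0, 1.0, 1.0] for _ in range(5)]

pooled_a = max_pool(relu(conv2d_valid(edge_a, edge_kernel)))
pooled_b = max_pool(relu(conv2d_valid(edge_b, edge_kernel)))
```

In this example the two convolution outputs differ, because the edge is detected one position apart, but the pooled outputs are identical. That is the kind of tolerance to small shifts that pooling is meant to provide.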
At the same time, this research is also of great significance for understanding the computing mode of the brain. As research on brain mechanisms has developed, researchers have begun to consider using computers to simulate brain function and behavior [35]. Recently, Eliasmith and colleagues put forward a brain computing model consisting of 2.5 million simulated neurons [36]. This model can perform multiple vision and learning tasks. Such whole-brain simulation builds on the great progress made in modeling early vision. These computing models have irreplaceable scientific value for explaining the brain's information-processing mechanisms. At the same time, drawing on achievements in cognitive science to develop new computational learning theories is an important part of artificial intelligence. This paper focuses on the vision problems in brain-inspired computing; the other parts of the field are beyond its scope and are introduced only briefly.
Summary
Over the past few decades, and especially in the past 10 years, neuroscience has developed very rapidly. We have accumulated a wealth of knowledge about the working principles of the brain, which provides an important biological basis for the development of brain-inspired computer vision. However, as for how neurons with relatively simple functions are organized into networks, many problems remain to be resolved.
The brain is a complex network consisting of billions of neurons. The material basis of sensation, action, and other brain functions is the orderly transmission and processing of information in this huge network.
In the field of algorithms, many people worry about how to create algorithms for brain-inspired computing when we know so little about the working principles of the brain. However, the multi-layer, stage-by-stage processing structure of the brain, especially of the visual pathway, is basic knowledge that neuroscience has already acquired. This suggests that we do not have to understand the principles of the brain fully in order to study brain-inspired algorithms. On the contrary, what is truly enlightening is probably a relatively basic principle. Some of these principles may already be known to brain scientists, while others have yet to be discovered. Every basic principle, and its successful application in artificial information processing systems, may bring progress in brain-inspired computing research. Importantly, this process of discovery and transfer can not only promote the development of artificial intelligence but also, in parallel, deepen our understanding of why the brain processes information so effectively [37], thus forming a virtuous circle in which brain science and artificial intelligence technology promote each other.
Brain-inspired computer vision is a deep intersection and fusion of the life sciences, especially brain science, with information technology. It involves understanding the brain's visual information processing principles, developing new algorithms on this basis, and applying them to a new generation of artificial intelligence, human-computer interaction, and other fields. Brain-inspired computer vision technology is expected to enable artificial visual information processing systems to approach the capability of the human brain at very low energy consumption. Many people believe that substantial progress in this direction will open the prelude to an intelligence revolution and bring profound changes to social production and life [38].
References
[1] K. He, X. Zhang, S. Ren, et al. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification[C]. IEEE International Conference on Computer Vision, Santiago, Chile, 2015, 1026–1034
[2] J. K. Tsotsos, S. M. Culhane, W. Y. K. Wai, et al. Modeling visual attention via selective tuning[J]. Artificial Intelligence, 1995, 78(1):507–545
[3] L. Chen. The topological approach to perceptual organization[J]. Visual Cognition, 2005, 12(4):553–637
[4] D. Marr. Computational Theory of Vision (in Chinese)[M]. Science Press, 1988
[5] E. H. Land, J. J. McCann. Lightness and retinex theory[J]. Journal of the Optical Society of America, 1971, 61(1):1–11
[6] E. H. Land. An alternative technique for the computation of the designator in the retinex theory of color vision[J]. Proceedings of the National Academy of Sciences, 1986, 83(10):3078–3080
[7] Z.-u. Rahman. Properties of a center/surround Retinex: part 1. signal processing design[J]. NASA Langley Technical Report Server, 1995
[8] D. J. Jobson, Z.-u. Rahman, G. A. Woodell. A multiscale retinex for bridging the gap between color images and the human observation of scenes[J]. IEEE Transactions on Image Processing, 1997, 6(7):965–976
[9] K. Devlin. A review of tone reproduction techniques[J]. Computer Science, University of Bristol, Technical Report CSTR-02-005, 2002
[10] S. N. Pattanaik, J. Tumblin, H. Yee, et al. Time-dependent visual adaptation for fast realistic image display[C]. IEEE Conference on Computer Graphics and Interactive Techniques, New Orleans, USA, 2000, 47–54
[11] P. Vangorp, K. Myszkowski, E. W. Graf, et al. A model of local adaptation[J]. ACM Transactions on Graphics, 2015, 34(6):166:1–13
[12] X.-S. Zhang, S.-B. Gao, R.-X. Li, et al. A Retinal Mechanism Inspired Color Constancy Model[J]. IEEE Transactions on Image Processing, 2016, 25(3):1219–1232
[13] X.-S. Zhang, S.-B. Gao, C.-Y. Li, et al. A Retina Inspired Model for Enhancing Visibility of Hazy Images[J]. Frontiers in Computational Neuroscience, 2015, 9:151, 1–13
[14] N. Petkov, M. A. Westenberg. Suppression of contour perception by band-limited noise and its relation to nonclassical receptive field inhibition[J]. Biological Cybernetics, 2003, 88(3):236–246
[15] J. Malik, P. Perona. Preattentive texture discrimination with early vision mechanisms[J]. Journal of the Optical Society of America A: Optics, Vision and Image Science, 1990, 7(5):923–932
[16] D. Weng, Y. Wang, M. Gong, et al. DERF: Distinctive Efficient Robust Features From the Biological Modeling of the P Ganglion Cells[J]. IEEE Transactions on Image Processing, 2015, 24(8):2287–2302
[17] H. Wei, Q. Zuo. A biologically inspired neurocomputing circuit for image representation[J]. Neurocomputing, 2015, 164:96–111
[18] J. Zhang, Y. Barhomi, T. Serre. A new biologically inspired color image descriptor[C]. European Conference on Computer Vision, Florence, Italy, 2012, 312–324
[19] A. M. Waxman, A. N. Gove, D. A. Fay, et al. Color night vision: opponent processing in the fusion of visible and IR imagery[J]. Neural Networks, 1997, 10(1):1–6
[20] A. M. Treisman, G. Gelade. A feature-integration theory of attention[J]. Cognitive Psychology, 1980, 12(1):97–136
[21] C. Christopoulos, A. Skodras, T. Ebrahimi. The JPEG2000 still image coding system: an overview[J]. IEEE Transactions on Consumer Electronics, 2000, 46(4):1103–1127
[22] L. Itti, C. Koch, E. Niebur. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998(11):1254–1259
[23] A. Borji, L. Itti. State-of-the-art in visual attention modeling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1):185–207
[24] A. Borji, M.-M. Cheng, H. Jiang, et al. Salient object detection: A survey[J]. arXiv preprint arXiv:1411.5878, 2014
[25] U. Rutishauser, D. Walther, C. Koch, et al. Is bottom-up attention useful for object recognition?[C]. IEEE Conference on Computer Vision and Pattern Recognition, Washington, USA, 2004, 37–44
[26] S. Wang, M. Jiang, X. M. Duchesne, et al. Atypical Visual Saliency in Autism Spectrum Disorder Quantified through Model-Based Eye Tracking[J]. Neuron, 2015, 88(3):604–616
[27] T. Deng, A. Chen, M. Gao, et al. Top-down based saliency model in traffic driving environment[C]. IEEE International Conference on Intelligent Transportation Systems, Qingdao, China, 2014, 75–80
[28] M. Riesenhuber, T. Poggio. Hierarchical models of object recognition in cortex[J]. Nature Neuroscience, 1999, 2(11):1019–1025
[29] T. Serre, L. Wolf, S. Bileschi, et al. Robust object recognition with cortex-like mechanisms[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(3):411–426
[30] G. E. Hinton. What kind of graphical model is the brain?[C]. International Joint Conference on Artificial Intelligence, Edinburgh, UK, 1765–1775
[31] G. E. Hinton, S. Osindero, Y.-W. Teh. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7):1527–1554
[32] G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786):504–507
[33] Y. Bengio, A. Courville, P. Vincent. Representation learning: A review and new perspectives[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8):1798–1828
[34] Y. LeCun, Y. Bengio, G. Hinton. Deep learning[J]. Nature, 2015, 521(7553):436–444
[35] C. K. Machens, et al. Building the human brain[J]. Science, 2012, 338(6111):1156–1157
[36] C. Eliasmith, T. C. Stewart, X. Choo, et al. A large-scale model of the functioning brain[J]. Science, 2012, 338(6111):1202–1205
[37] D. L. K. Yamins, J. J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex[J]. Nature Neuroscience, 2016, 19:356–365
[38] N. Bostrom. Superintelligence[M]. New York: Oxford University Press, 2014
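- Author: Yuang
- Link: http://www.yuuuuang.com/2018/02/12/A Review of Brian - Inspired Computer Vision /
- Copyright notice: Unless otherwise stated, all articles on this blog are licensed under the MIT License. Please credit the source when reposting!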