1. Research Objectives and Significance (Literature Review)
1.1 Research Objectives and Significance
Dynamic range is the ratio of the maximum to the minimum luminance in the real world or in an image. The dynamic range of a visual sensor's imaging determines the range over which it can perceive both strong and weak targets in the environment. However, high-dynamic-range visual sensors place demanding requirements on materials, devices, and optics; they are technically difficult and expensive to build, and are unlikely to see widespread use in the short term. Ordinary digital cameras cannot fully capture the light-intensity levels of high-dynamic-range natural scenes, so pixel information is lost in under-exposed or over-exposed regions, producing low dynamic range (LDR) images. The low-dynamic-range imaging sensor of modern digital cameras is a major factor limiting their ability to capture high-dynamic-range images of the natural world.
High dynamic range (HDR) imaging, unlike traditional LDR imaging, provides the ability to capture, manipulate, and display the true luminance range of real-world scenes. HDR imaging technology has advanced steadily over recent decades; for example, growing demand for HDR content has driven the continued development of HDR displays and of specialized devices for capturing it. HDR imaging offers many advantages: compared with traditional LDR images, HDR images carry luminance information that better matches high-dynamic-range natural scenes and therefore provide richer contrast. However, high-quality HDR cameras remain very expensive, and photographers have long resorted to synthesizing an HDR image from multiple LDR images of the same scene taken at different exposures.
2. Research Content and Approach
The main research content of this thesis comprises a neural network model for reconstructing HDR images from LDR images, and an embedded-GPU acceleration technique for applying this model to image reconstruction. The network model is divided into a feature-extraction part and an image-reconstruction part. During training, as shown in Fig. 2-1, images from an HDR dataset are first cropped to a fixed size to facilitate training; then the virtual camera of Eilertsen et al. [19] is used to simulate the process of photographing a natural scene, generating LDR samples of the corresponding scenes from the HDR images. The LDR samples are fed into the neural network, and the network weights are updated by minimizing a loss function.
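The HDR-to-LDR simulation step can be illustrated with a minimal sketch. This is not the exact camera model of Eilertsen et al. [19], only a simplified stand-in (exposure scaling, a gamma-style response curve, clipping, and quantization); the function name and parameter values are illustrative assumptions.

```python
import numpy as np

def virtual_camera(hdr, exposure=1.0, gamma=2.2, n_bits=8):
    """Simulate capturing an LDR image from an HDR radiance map.

    Simplified illustration: scale by exposure, clip to [0, 1],
    apply a gamma-style camera response curve, then quantize to
    n_bits. Saturated regions lose the information the network
    must later recover.
    """
    x = np.clip(hdr * exposure, 0.0, 1.0)   # exposure + sensor saturation
    x = x ** (1.0 / gamma)                  # simplified response curve
    levels = 2 ** n_bits - 1
    return np.round(x * levels) / levels    # 8-bit quantization

# Radiance values above the saturation point all map to 1.0,
# so the two brightest samples become indistinguishable in the LDR image.
hdr = np.array([0.05, 0.5, 2.0, 8.0])
ldr = virtual_camera(hdr, exposure=1.0)
```

Training pairs are then (ldr, hdr), with the loss computed between the network output and the original HDR patch.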
Convolutional layers can extract image features, and pooling layers can downsample the image; stacked convolution-pooling blocks thus extract multi-scale information, after which transposed convolutions [22] perform upsampling and HDR image reconstruction. This is the encoder-decoder (U-Net) structure used by Zhang and Lalonde [18], Eilertsen et al. [19], and Endo et al. [20]. Analysis and experiments show, however, that while pooling downsamples the image it also irreversibly discards pixel information, which can introduce artifacts and checkerboard effects [23] into the reconstruction. This thesis instead adopts dilated convolutions [24], which insert holes into the standard convolution kernel to enlarge the receptive field. Stacking dilated convolutions [24] with different dilation rates can replace the convolution-pooling structure for extracting multi-scale image features, and to some extent avoids artifacts and checkerboard effects [23].
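The key property of this design can be demonstrated in one dimension. The sketch below, with illustrative function names and a plain NumPy implementation rather than any deep-learning framework, shows that a dilated convolution keeps the spatial resolution of its input (unlike pooling) while a stack of them still grows the receptive field quickly.

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """1-D dilated convolution with zero padding ("same" output size).

    Holes of size `dilation - 1` are inserted between kernel taps,
    enlarging the receptive field without any downsampling.
    """
    k = len(w)
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    return np.array([
        sum(w[j] * xp[i + j * dilation] for j in range(k))
        for i in range(len(x))
    ])

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)
    return rf

x = np.random.rand(32)
y = dilated_conv1d(x, np.ones(3) / 3, dilation=2)
# len(y) == len(x): resolution is preserved, so no pixel information
# is discarded the way pooling discards it.
rf = receptive_field(3, [1, 2, 4, 8])  # exponentially growing dilations
```

With 3-tap kernels and dilations 1, 2, 4, 8, four layers already cover a 31-pixel receptive field at full resolution, which is the multi-scale behavior the convolution-pooling stack was providing.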
3. Research Plan and Schedule
(1) Weeks 1-4: Review the relevant literature and define the research content. Finalize the thesis plan and complete the proposal report.
(2) Weeks 5-6: Study and reproduce existing HDR image reconstruction methods, and summarize the strengths and weaknesses of each.
(3) Weeks 7-9: Design the neural network architecture, build the new model, and become familiar with how the Jetson TX2 works.
4. References (12 or more)
[1] Debevec P E, Malik J. Recovering High Dynamic Range Radiance Maps from Photographs[C]. In Proc. of SIGGRAPH, 1997: 369-378.
[2] Kalantari N K, Ramamoorthi R. Deep High Dynamic Range Imaging of Dynamic Scenes[J]. ACM Transactions on Graphics, 2017, 36(4): 1-12.
[3] Kronander J, Gustavson S, Bonnet G, et al. Unified HDR Reconstruction from Raw CFA Data[C]. IEEE International Conference on Computational Photography. IEEE, 2013.
[4] Banterle F, Ledda P, Debattista K, et al. Inverse Tone Mapping[C]. In Proc. of GRAPHITE, 2006: 349-356.
[5] Banterle F, Ledda P, Debattista K, et al. A Framework for Inverse Tone Mapping[J]. The Visual Computer, 2007, 23(7).
[6] Banterle F, Ledda P, Debattista K, et al. Expanding Low Dynamic Range Videos for High Dynamic Range Applications[C]. In Proc. of SCCG, 2008: 33-41.
[7] Reinhard E, Stark M, Shirley P, et al. Photographic Tone Reproduction for Digital Images[C]. Conference on Computer Graphics and Interactive Techniques. ACM, 2002.
[8] Rempel A G, Trentacoste M, Seetzen H, et al. Ldr2Hdr: On-the-fly Reverse Tone Mapping of Legacy Video and Photographs[J]. ACM Transactions on Graphics, 2007, 26(3): 39.
[9] Kovaleski R P, Oliveira M M. High-Quality Brightness Enhancement Functions for Real-Time Reverse Tone Mapping[J]. The Visual Computer, 2009, 25(5): 539-547.
[10] Kovaleski R P, Oliveira M M. High-Quality Reverse Tone Mapping for a Wide Range of Exposures[C]. 27th SIBGRAPI Conference on Graphics, Patterns and Images. IEEE, 2014.
[11] Akyüz A O, Fleming R, Riecke B E, et al. Do HDR Displays Support LDR Content?[C]. ACM, 2007: 38.
[12] Masia B, Serrano A, Gutierrez D. Dynamic Range Expansion Based on Image Statistics[J]. Multimedia Tools and Applications, 2017, 76(1): 631-648.
[13] Huo Y, Yang F, Dong L, et al. Physiological Inverse Tone Mapping Based on Retina Response[J]. The Visual Computer, 2014, 30(5): 507-517.
[14] Meylan L, Daly S, Süsstrunk S. The Reproduction of Specular Highlights on High Dynamic Range Displays[C]. Color and Imaging Conference, 2006.
[15] Didyk P, Mantiuk R, Hein M, et al. Enhancement of Bright Video Features for HDR Displays[J]. Computer Graphics Forum, 2008, 27(4): 1265-1274.
[16] Wang L, Wei L-Y, Zhou K, et al. High Dynamic Range Image Hallucination[C]. In Proc. of EGSR, 2007: 321-326.
[17] Ning S, Xu H, Song L, et al. Learning an Inverse Tone Mapping Network with a Generative Adversarial Regularizer[C]. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018: 1383-1387.
[18] Zhang J, Lalonde J F. Learning High Dynamic Range from Outdoor Panoramas[C]. IEEE International Conference on Computer Vision (ICCV), 2017: 4529-4538.
[19] Eilertsen G, Kronander J, Denes G, et al. HDR Image Reconstruction from a Single Exposure Using Deep CNNs[J]. ACM Transactions on Graphics, 2017, 36(6): 178:1-178:15.
[20] Endo Y, Kanamori Y, Mitani J. Deep Reverse Tone Mapping[J]. ACM Transactions on Graphics, 2017, 36(6).
[21] Kinoshita Y, Kiya H. Deep Inverse Tone Mapping Using LDR Based Learning for Estimating HDR Images with Absolute Luminance[J]. 2019.
[22] Zeiler M D, Krishnan D, Taylor G W, et al. Deconvolutional Networks[C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2010.
[23] Odena A, Dumoulin V, Olah C. Deconvolution and Checkerboard Artifacts[J]. Distill, 2016. http://doi.org/10.23915/distill.00003.
[24] Yu F, Koltun V. Multi-Scale Context Aggregation by Dilated Convolutions[J]. 2015.
[25] Xu K, Ba J, Kiros R, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention[C]. ICML, 2015: 2048-2057.
[26] He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016.
[27] Huang G, Liu Z, van der Maaten L, et al. Densely Connected Convolutional Networks[J]. 2016.
[28] Wang Z, Bovik A C, Sheikh H R, et al. Image Quality Assessment: From Error Visibility to Structural Similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4).
[29] Wang Z, Simoncelli E P, Bovik A C. Multiscale Structural Similarity for Image Quality Assessment[C]. Signals, Systems and Computers, 2003.
[30] Mantiuk R, Kim K J, Rempel A G, et al. HDR-VDP-2: A Calibrated Visual Metric for Visibility and Quality Predictions in All Luminance Conditions[J]. ACM Transactions on Graphics, 2011.
[31] Aydin T O, Mantiuk R, Seidel H P. Extending Quality Metrics to Full Luminance Range Images[J]. Proceedings of SPIE - The International Society for Optical Engineering, 2008, 6806.
