The 2018 national Vision and Learning Seminar for Young Scholars (VALSE 2018) opened in Dalian on April 20. The seminar aims to give young Chinese researchers in computer vision, image processing, pattern recognition, and machine learning a venue for in-depth academic exchange, to promote the exchange of ideas and academic collaboration among young researchers in China, and to help Chinese researchers make major academic contributions in AI and raise their influence on the international academic stage.
During the conference, leading computer vision researchers from major Chinese universities, expert representatives from domestic AI technology companies, and well-known internet companies such as Alibaba, Baidu, and Didi gathered on site to present cutting-edge AI technology and results from deep learning applications, and to discuss them in depth.
Professor Chunhua Shen of xcsports (China) was invited as a conference speaker and gave a talk on Visual Question Answering (hereafter VQA) at the VALSE Workshop on Vision and Language. Visual question answering and visual dialog form an ultimate task that combines computer vision and natural language processing. Compared with image captioning, VQA better reflects how deeply a system understands an image. The talk covered the following topics.

Professor Chunhua Shen of xcsports (China) delivering his keynote talk
A knowledge-base-augmented VQA model framework and a knowledge-based VQA dataset (FVQA)
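The usual approach to VQA is to take features extracted by a convolutional neural network (CNN), or attributes predicted from them, combine them with the question asked about the image, feed both into a recurrent network (RNN, LSTM, GRU, etc.), and then generate the answer. However, existing VQA model frameworks consider only the visual information in the image, so they cannot answer deeper questions well. Answering such questions requires additional knowledge as support.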
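The data flow described above can be illustrated with a minimal sketch. It is not the speaker's model: `ToyVQA`, its dimensions, and its randomly initialized weights are hypothetical, the "CNN features" are just a hand-written vector, and a plain tanh recurrent cell stands in for an LSTM or GRU. Because nothing is trained, the predicted answer is meaningless; the point is only how image features and question words pass through one recurrent encoder before an answer is scored.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyVQA:
    """Hypothetical, untrained stand-in for a CNN + RNN VQA pipeline."""

    def __init__(self, vocab, answers, feat_dim=4, hidden_dim=8, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.answers = list(answers)
        self.hidden_dim = hidden_dim
        self.embed = rand_matrix(len(vocab), feat_dim)     # word embeddings
        self.w_in = rand_matrix(hidden_dim, feat_dim)      # input -> hidden
        self.w_rec = rand_matrix(hidden_dim, hidden_dim)   # hidden -> hidden
        self.w_out = rand_matrix(len(self.answers), hidden_dim)

    def _step(self, hidden, x):
        """One step of a plain tanh recurrent cell."""
        a = matvec(self.w_in, x)
        b = matvec(self.w_rec, hidden)
        return [math.tanh(p + q) for p, q in zip(a, b)]

    def answer_distribution(self, image_features, question):
        # The image features are fed first, then the question words.
        hidden = self._step([0.0] * self.hidden_dim, image_features)
        for word in question.lower().split():
            if word in self.vocab:  # unknown words are skipped
                hidden = self._step(hidden, self.embed[self.vocab[word]])
        return softmax(matvec(self.w_out, hidden))

    def predict(self, image_features, question):
        probs = self.answer_distribution(image_features, question)
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.answers[best]


# Assumed inputs: a 4-dimensional vector standing in for CNN features.
model = ToyVQA(vocab=["what", "color", "is", "the", "umbrella"],
               answers=["red", "blue", "yes", "no"])
features = [0.2, 0.1, 0.7, 0.3]
print(model.predict(features, "What color is the umbrella"))
```

Note that the sketch only ever sees the image vector and the question, which is exactly the limitation the talk points to: there is no place for outside knowledge to enter.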
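Professor Shen's team proposed a model framework that incorporates a knowledge base and can combine the information contained in the image with an external knowledge base [1,2,3]. In addition, current VQA methods do not give the reason behind an answer, so the answer cannot be traced back to the relevant image features or the relevant knowledge. The team therefore proposed the VQA-Machine framework [4], which fuses the outputs of several computer vision tasks and additionally outputs the reasons for its answer. Finally, because the questions and answers attached to images in current VQA datasets are too simple, the team proposed a new fact-based VQA dataset (FVQA) [5], which also provides, for each image, the supporting facts related to the question asked about it.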
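To make the idea of an answer that comes with its supporting fact concrete, here is a small sketch. The `Fact` triple, the relation names, and the tiny `KNOWLEDGE_BASE` are hypothetical illustrations and are not the FVQA schema or its data. The detected concepts are assumed to come from some visual model that is not shown.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Fact(NamedTuple):
    """A (subject, relation, object) triple from a knowledge base."""
    subject: str
    relation: str
    obj: str


# Hypothetical facts, written by hand for illustration only.
KNOWLEDGE_BASE = [
    Fact("cat", "CapableOf", "climbing trees"),
    Fact("umbrella", "UsedFor", "keeping off rain"),
    Fact("dog", "IsA", "mammal"),
]


def answer_with_fact(
    detected_concepts: Iterable[str],
    relation: str,
    knowledge_base=KNOWLEDGE_BASE,
) -> Optional[Tuple[str, Fact]]:
    """Return (answer, supporting fact), or None if no fact applies.

    A fact applies when its relation matches the one the question asks
    about and its subject is among the concepts detected in the image.
    """
    concepts = set(detected_concepts)
    for fact in knowledge_base:
        if fact.relation == relation and fact.subject in concepts:
            return fact.obj, fact
    return None


# Assumed scenario: the image shows a person holding an umbrella and the
# question is "What is the object in the person's hand used for?"
result = answer_with_fact({"person", "umbrella"}, "UsedFor")
if result is not None:
    answer, fact = result
    print(answer, "because", fact)
```

Returning the fact together with the answer is what makes the answer traceable: a reader can check which image concept and which piece of knowledge produced it.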
ÀàÈ˶Ի°ÌìÉú
Because the output of current VQA systems tends to be mechanical and short, the talk also introduced a method proposed by the team that uses adversarial learning (GAN) together with reinforcement learning to help generate more natural, human-like language [6]. The method uses a co-attention encoder that fuses the image, the question, and the question-answer history as the generator, and a discriminator that uses the generator's attention memory over the history to distinguish machine-generated dialog from human-like dialog.
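The encoder idea above, attending jointly over the dialog history and the image with the question as a guide, can be sketched in a few lines. This is a deliberately simplified illustration using dot-product attention, not the encoder from [6]; the function names, the two-step ordering (history first, then image), and all vectors are assumptions. The adversarial and reinforcement learning parts are not shown.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, items):
    """Dot-product attention: weight each item by its match to the query.

    Returns the weighted sum of the items and the attention weights.
    """
    scores = [sum(q * x for q, x in zip(query, item)) for item in items]
    weights = softmax(scores)
    dim = len(items[0])
    context = [sum(w * item[d] for w, item in zip(weights, items))
               for d in range(dim)]
    return context, weights


def co_attention_encode(question, history, regions):
    """Fuse question, dialog history, and image regions into one vector.

    Step 1: the question attends over the past question-answer rounds.
    Step 2: the question plus the history summary attends over regions.
    """
    history_ctx, history_w = attend(question, history)
    query = [q + h for q, h in zip(question, history_ctx)]
    image_ctx, image_w = attend(query, regions)
    encoded = question + history_ctx + image_ctx  # list concatenation
    return encoded, history_w, image_w


# Assumed toy vectors: 2-dimensional embeddings for everything.
question = [1.0, 0.0]
history = [[1.0, 0.0], [0.0, 1.0]]             # two earlier rounds
regions = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]  # three image regions
encoded, history_w, image_w = co_attention_encode(question, history, regions)
print(len(encoded), history_w, image_w)
```

In an adversarial setup, a vector like `encoded` would feed the answer generator, and the attention weights are the kind of intermediate signal a discriminator could also be given.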
References:
[1] Image Captioning and Visual Question Answering Based on Attributes and External Knowledge. Wu & Wang et al. TPAMI 2017
[2] Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources. Wu & Wang et al. CVPR 2016
[3] What Value Do Explicit High-Level Concepts Have in Vision to Language Problems. Wu et al. CVPR 2016
[4] The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions. Wang & Wu et al. CVPR 2017
[5] FVQA: Fact-Based Visual Question Answering. Wang & Wu et al. TPAMI 2018
[6] Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning. Wu & Wang et al. CVPR 2018
xcsports (China) follows the latest developments in the field closely and places great importance on cultivating young researchers as a new generation of talent. As one of the platinum sponsors of the conference, the company sent a team led by its CEO, Dr. Zhenghua Yu, who introduced the company's team, core strengths, and latest research results to the young researchers and industry representatives in attendance. In the exhibition area, we highlighted the company's core technical strengths and its latest product, a forward-looking ADAS all-in-one unit based on embedded deep learning. Throughout the three-day exhibition, a steady stream of attendees stopped by to take a look.
We warmly welcome outstanding students to join us, sharpen their practical skills in a team with a strong academic atmosphere, and work with us to bring artificial intelligence to the automotive industry and build the AI brain of the car.