optimizing visual search reranking via pairwise learning
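Optimizing R&D Investment in High-Tech Enterprises in the Context of the Digital Economy: A Study of Countermeasures
DIGITCW 2024.04, Experience Exchange, p. 188
As the main force of the innovation-driven development strategy, high-tech enterprises play an important role in promoting the commercialization of scientific and technological achievements and in strengthening firms' capacity for independent innovation.
Meanwhile, China's digital economy is growing rapidly. New drivers, represented by digital technologies and their applications, are pushing data to integrate deeply with traditional factors of production and are promoting the transformation and upgrading of traditional industries.
Under the influence of the digital economy, it is therefore especially important for high-tech enterprises to further optimize their R&D investment model, so as to improve the conversion efficiency of R&D investment and strengthen their core competitiveness.
1 Core Concepts and Theoretical Basis
1.1 High-Tech Enterprises
A high-tech enterprise is one in which scientific and technical staff make up a large share of the workforce, R&D spending accounts for a large share of operating costs, and products and services must be continually researched, developed, and turned into commercial results.
What distinguishes high-tech enterprises from ordinary manufacturers is their emphasis on R&D investment and talent development, their constant pursuit of innovation, and the higher returns they obtain as a result.
China's high-tech enterprises have grown rapidly in recent years; by the end of 2022 their number had passed 275,000.
1.2 R&D Investment
R&D investment is the spending an enterprise incurs to research and develop new products or technologies, including remuneration for technical development staff, raw materials consumed by R&D activities, and equipment maintenance costs.
Because the conversion of R&D results and the expected returns are uncertain, and because the capital and talent costs committed to developing new technology are irreversible, both managers and scholars pay close attention to the effectiveness of R&D investment.
Scholars currently hold differing views on how R&D investment affects business performance, including a promoting effect, an uncertain effect, a lagged effect, and a heterogeneous effect [1].
Studies of the mechanisms at work in high-tech enterprises, however, have mostly confirmed that appropriate R&D investment significantly promotes both innovation performance and business performance [2].
1.3 Digital Economy
Don Tapscott (1996) first proposed the concept of the digital economy, and scholars have since interpreted it in depth in terms of its elements, goals, and carriers.
Data, information technology, and industry are the three elements of the digital economy: as the traditional economy shifts toward networked and data-driven forms, data has become a core enterprise resource; cloud computing, the Internet of Things, artificial intelligence …
Funding: Shanxi Province Higher Education Philosophy and Social Science Project "Research on the Mechanisms Influencing R&D Investment in High-Tech Enterprises" (2022W122).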
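Integrating and Optimizing the Framework of the Evaluation System for Green Retrofitting of University Campuses
Urban Environment Design (城市环境设计), Academic, 146 | 12 | 2023, p. 364
Abstract (from the Chinese): Green retrofitting of existing university campuses can effectively advance green campuses in China. China still lacks a dedicated assessment standard, however, so evaluation and guidance must be carried out by an evaluation system made up of several standards working in coordination; integrating and optimizing the system's framework would make that coordination smoother and the evaluation more rigorous. By comparing the typical features of LEED, Green Mark, STARS, and Green Metric abroad with those of China's related standards, in terms of development and evolution, linkage between standards, and compiling bodies, this paper proposes ways to integrate and optimize China's framework: coordinate how the framework forms its logical relationships and responds to policy; add an intermediate framework level to the Assessment Standard for Green Campus; insert a "modular" framework of regional factors; and use an information platform to better coordinate the compilation of standards. The aim is to provide a reference for the green retrofitting of university campuses.
Abstract: Retrofitting the campuses of existing higher education institutions into green campuses can effectively promote the development of green campuses in China. However, China still lacks a specific assessment standard, so evaluation and guidance must be coordinated through an evaluation system made up of the Assessment Standard for Green Campus, the Assessment Standard for Green Building, and the Assessment Standard for Green Retrofitting of Existing Building. Integrating and optimizing the framework of these standards can make their coordination smoother and the evaluation more scientific. This paper compares the typical features of LEED, Green Mark, STARS, Green Metric, and other foreign systems with the related standards of the evaluation system in China in terms of the development, evolution, correlation, and influencing factors of the framework, and proposes the following ideas for integrating and optimizing the evaluation system framework for green retrofitting of higher education institutions' campuses in China: coordinate the framework's logic formation and its response to policy; add an intermediate framework level to the Assessment Standard for Green Campus; insert a "modular" regional factor framework; and use an information platform to optimize the overall planning of standard compilation. The aim is to provide a reference for promoting the green retrofitting of higher education institutions' campuses.
Keywords: higher education institutions' campuses; green retrofitting; evaluation standard; system framework
Authors: LIU Jinri (刘金日), SONG Kun (宋昆), CHEN Guorui (陈国瑞), ZHAO Di* (赵迪)
Article ID: 1672-9080(2023)12-0364-08
DOI: 10.19974/21-1508/TU.2023.12.0364
CLC number: TU 984.14
Document code: A
Received: 2022-09-08
Revised: 2023-03-08
Funding: National Natural Science Foundation of China, "Evaluation and Design Methods for Green Retrofitting of Existing University Campuses Based on Multi-Objective Optimization: The Beijing-Tianjin-Hebei Region as an Example" (grant no. 52078325)
About the authors: LIU Jinri, male, master's degree, Tongji Architectural Design (Group) Co., Ltd.; master's graduate of the School of Architecture, Tianjin University.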
Homework Advice for a Younger Brother (in English)
1. Create a Schedule: Encourage your brother to create a study schedule that includes specific times for homework. This helps in developing a routine and ensures that homework is a priority.
2. Choose a Suitable Environment: Suggest that he find a quiet and comfortable place to do his homework. A well-lit, clutter-free space can improve focus and productivity.
3. Break Down Tasks: If the homework seems overwhelming, advise him to break it down into smaller, manageable tasks. This can make the work seem less daunting and more achievable.
4. Use Tools and Resources: Recommend using tools like dictionaries, thesauruses, and educational websites to assist with understanding and completing assignments.
5. Stay Organized: Encourage the use of a planner or digital app to keep track of assignments, due dates, and tests. This can help in managing time effectively.
6. Take Regular Breaks: Suggest taking short breaks after completing a section of work. This can prevent burnout and maintain a fresh mind for continued studying.
7. Ask for Help When Needed: It's important for your brother to know that it's okay to ask for help from teachers, parents, or tutors if he's struggling with a particular subject or concept.
8. Practice Active Reading: For assignments that involve reading, encourage him to take notes, underline important points, and ask questions to improve comprehension.
9. Revise Regularly: Advise him to review completed homework and class notes regularly to reinforce learning and prepare for tests.
10. Stay Motivated: Help him set goals and rewards for completing homework. This can make the process more enjoyable and motivating.
11. Avoid Procrastination: Encourage starting homework as soon as possible after school to avoid last-minute stress and rushed work.
12. Use Technology Wisely: While technology can be a great tool for learning, it's also a common distraction. Suggest using apps that limit distractions or setting specific times for using digital devices.
13. Develop Good Writing Habits: Encourage clear and organized writing, as well as proofreading to ensure the quality of the work.
14. Stay Healthy: Good health contributes to better focus and energy levels. Advise maintaining a balanced diet, regular exercise, and adequate sleep.
15. Reflect on Progress: Encourage your brother to reflect on his homework habits and identify areas for improvement. This self-awareness can lead to better study strategies.
16. Be Patient and Persistent: Learning takes time, and everyone learns at a different pace. Remind him that it's okay to take his time and to keep trying even when things get tough.
17. Engage in Group Study: Sometimes studying with classmates can provide different perspectives and make learning more interactive.
18. Seek Feedback: Encourage him to ask for feedback on his work from teachers or peers to understand areas of strength and improvement.
19. Keep Materials Handy: Having all necessary materials like pens, paper, textbooks, and notes organized and ready to use can save time and reduce stress.
20. Celebrate Achievements: Recognize and celebrate his efforts and achievements, no matter how small, to boost his confidence and motivation.
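Some Methods of Semantic Analysis (Part 1)
Semantic analysis, as used in this article, means applying machine learning methods to mine and learn the deeper concepts behind text, images, and other content.
Wikipedia's definition: In machine learning, semantic analysis of a corpus is the task of building structures that approximate concepts from a large set of documents (or images).
Over the past few years at work I have been involved in a number of projects, including search ads, social ads, Weibo ads, brand ads, and content ads.
To maximize the returns of our advertising platform, we first need to understand the user, the context (the surroundings in which the ad will be shown), and the ad itself; only then can the most suitable ad be shown to the user.
All of this depends on semantic analysis of users, contexts, and ads, which gave rise to several sub-projects, such as text semantic analysis, image semantic understanding, semantic indexing, semantic association of short strings, and semantic matching between users and ads.
Below I write up the semantic analysis methods I am familiar with. Our work was mostly results-driven, so my grasp of the theory behind the methods may not be deep; please treat this as a personal summary of what I have learned, and point out anything that is wrong. Thank you.
The article has four parts: basic text processing, text semantic analysis, image semantic analysis, and a summary of semantic analysis.
It starts with the basic methods of text processing, which form the foundation of semantic analysis.
It then covers semantic analysis methods for text and for images in two separate sections. Note that although they are treated separately, text and image semantic analysis share many methods and are closely related.
Finally, it briefly describes how semantic analysis is applied to "user-ad matching" in Guangdiantong (广点通) and looks ahead to future semantic analysis methods.
1 Basic Text Processing
Before discussing text semantic analysis, we first cover basic text processing, because it forms the foundation of semantic analysis.
Text processing has many aspects; given the topic of this article, only Chinese word segmentation and term weighting are introduced here.
1.1 Chinese Word Segmentation
Given a piece of text, the first step is usually word segmentation.
Segmentation methods generally fall into the following categories:
• Segmentation based on string matching.
This method scans the text in one of several ways and looks up candidate words in a dictionary one by one.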
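As an illustration of dictionary-based segmentation, the sketch below implements two common scanning strategies, forward and backward maximum matching. The source names no specific variant, so this choice, the function names, the `max_len` parameter, and the tiny `vocab` dictionary are all assumptions made for the example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len=4):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # words were collected from the end of the text
    return tokens


# Hypothetical toy dictionary, for illustration only.
vocab = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"
print(forward_max_match(sentence, vocab))
print(backward_max_match(sentence, vocab))
```

On this sentence the two scan directions disagree, which is why the choice of scanning strategy matters for string-matching segmenters.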
How to Keep Life in Good Order (English Essay)
Living a well-ordered life is a skill that many strive for but few master. It's not just about keeping a clean house or a tidy schedule; it's about creating a lifestyle that supports your goals and well-being. Here's my take on how to manage life's order, drawing from my personal experiences and observations.

Embracing Routine
A structured routine is the backbone of an orderly life. I've found that starting my day with a consistent morning ritual sets the tone for the rest of the day. Whether it's a quick workout, a healthy breakfast, or a few moments of meditation, these activities help me feel prepared and focused. It's the predictability of routine that provides a sense of control over the chaos of life.

Prioritizing Tasks
Understanding what's truly important is crucial. I've learned to prioritize tasks based on urgency and importance, a method popularized by Stephen Covey. By categorizing tasks into four quadrants, I can focus on what truly matters and avoid getting bogged down by less critical activities. This approach has saved me countless hours and reduced unnecessary stress.

Time Management
Effective time management is key to maintaining order. I use a planner to map out my week, allocating time for studies, hobbies, and relaxation. Apps like Google Calendar or Todoist have also been instrumental in keeping me on track. By visualizing my commitments, I can better manage my time and avoid overcommitting.

Decluttering
Physical clutter can lead to mental clutter. I've made it a habit to declutter my space regularly. Whether it's a quick tidy-up after school or a deep clean on weekends, a clean environment promotes clarity and focus. Marie Kondo's concept of keeping only items that spark joy has been particularly influential in my approach to decluttering.

Financial Organization
Managing finances is another aspect of life that requires order. I've started using budgeting apps to track my expenses and savings. This has helped me understand my spending habits and make more informed financial decisions. Setting financial goals and having a clear plan for achieving them has brought a sense of stability and security.

Digital Minimalism
In today's digital age, managing digital clutter is just as important as managing physical clutter. I've made a conscious effort to reduce screen time and limit my use of social media. By doing so, I've found more time for meaningful activities and less distraction from my goals.

Self-Care
Lastly, maintaining order in life is not just about external factors; it's also about internal well-being. I've learned the importance of self-care, whether it's through regular exercise, a balanced diet, or simply taking time to unwind with a good book or a walk in nature. Taking care of my mental and physical health has a profound impact on my ability to manage life's chaos.

Conclusion
In conclusion, managing life's order is a multifaceted endeavor that involves routine, prioritization, time management, decluttering, financial organization, digital minimalism, and self-care. It's a continuous process of refinement and adaptation. By implementing these strategies, I've found a greater sense of control and peace in my life. It's not about achieving perfection but about creating a life that supports your goals and allows you to thrive.
Describing Homework (in English)
Homework is an essential part of the educational process, designed to reinforce and deepen the understanding of concepts taught in class. It can take various forms, such as written assignments, reading tasks, practical exercises, or research projects. Here's a detailed description of the different aspects of homework in English:
1. Purpose of Homework: Homework serves multiple purposes. It helps students to practice and apply the knowledge gained in class, develop time management and organizational skills, and foster a sense of responsibility and self-discipline.
2. Types of Homework:
   - Written Assignments: These include essays, reports, and problem-solving tasks that require students to write down their thoughts and solutions.
   - Reading Assignments: Students are often asked to read chapters from textbooks or articles to prepare for class discussions or to gain a deeper understanding of a topic.
   - Math Problems: In subjects like mathematics, students are given problems to solve, which helps them to apply mathematical concepts and improve their problem-solving skills.
   - Science Experiments: Sometimes, students are asked to conduct experiments at home or in the lab to explore scientific concepts further.
   - Research Projects: These are in-depth assignments where students are expected to explore a topic extensively, gather information, and present their findings.
3. Homework Guidelines: Teachers usually provide guidelines for completing homework, which may include due dates, formatting requirements, and the level of detail expected in the answers.
4. Strategies for Effective Homework:
   - Time Management: Allocating specific times for homework can help students stay organized and avoid last-minute rushes.
   - Study Groups: Collaborating with classmates can provide different perspectives and make the learning process more enjoyable.
   - Asking for Help: If students are struggling with their homework, it's important to seek help from teachers, tutors, or peers.
5. Technology in Homework: With the advent of digital tools, homework can now be completed using computers, tablets, or smartphones. Online platforms and educational apps provide interactive ways to complete assignments and receive feedback.
6. Homework and Learning Outcomes: Research has shown that homework can improve academic performance when assigned in a thoughtful and balanced manner. However, excessive amounts of homework can lead to stress and burnout.
7. Homework Challenges:
   - Procrastination: Many students struggle with putting off homework until the last minute, which can lead to poor-quality work and increased stress.
   - Distractions: The presence of digital devices and social media can make it difficult for students to focus on their assignments.
   - Work-Life Balance: Students need to find a balance between schoolwork and other activities to maintain their mental health and well-being.
8. Feedback on Homework: Teachers provide feedback on completed homework to help students understand their mistakes and learn from them. This feedback is crucial for academic growth and improvement.
9. Homework Policies: Schools and educational institutions may have specific policies regarding homework, including the amount of homework assigned, the frequency of assignments, and the consequences for not completing homework.
10. Cultural Perspectives on Homework: Attitudes towards homework vary across cultures. In some countries, homework is considered a necessary part of education, while in others, it may be seen as an unnecessary burden on students.
Homework is a critical component of the educational experience, but it must be approached with a balance that respects students' time and well-being. It should be a tool for learning and growth, rather than a source of undue stress.
Wiley's Best Practice SEO Tips
Search Engine Optimization (SEO) is an essential aspect of digital marketing that helps businesses improve their online visibility and drive organic traffic to their websites. Wiley, one of the leading global providers of educational materials and solutions, has developed a set of best practice SEO tips that can help businesses succeed in the digital landscape. In this article, we will delve into Wiley's best practice SEO tips, step by step, to provide you with a comprehensive understanding of how to optimize your website and improve your search engine rankings.

Step 1: Perform Keyword Research
Keyword research is the foundation of any successful SEO strategy. It involves identifying the search terms that potential customers are using to find products or services related to your business. Start by brainstorming a list of relevant keywords and then use keyword research tools, such as Google Keyword Planner or SEMrush, to expand your list and determine the search volume and competitiveness of each keyword. Focus on long-tail keywords, as they are more specific and have a higher chance of driving targeted traffic to your website.

Step 2: Optimize On-Page Elements
On-page optimization refers to optimizing various elements on your website to make it more search engine and user-friendly. Here are some key on-page elements to focus on:
a) Title Tags: Include targeted keywords in your title tags to improve relevancy in search engine results.
b) Meta Descriptions: Write compelling meta descriptions that accurately summarize the content on each page and entice users to click through to your website.
c) Header Tags: Use header tags (H1, H2, H3...) to organically incorporate keywords into your content and improve readability.
d) URL Structure: Create clean and concise URLs that include relevant keywords to make it easier for search engines to understand the content of your web pages.
e) Image Optimization: Optimize your images by compressing them for faster loading times and include descriptive alt tags that include relevant keywords.

Step 3: Create High-Quality and Engaging Content
Content is king when it comes to SEO. Search engines value high-quality and relevant content that provides value to users. Here are some tips to create engaging content:
a) Keyword Incorporation: Naturally incorporate your target keywords into your content but avoid keyword stuffing.
b) Readability: Write for your audience, not just search engines. Use clear and concise language, break up content with headings and bullet points, and ensure proper grammar and spelling.
c) Multimedia Integration: Incorporate images, videos, and infographics into your content to make it more visually appealing and engaging.
d) Freshness: Regularly update your content to keep it relevant and demonstrate to search engines that your website is actively providing valuable information.

Step 4: Build Quality Backlinks
Backlinks are an essential off-page SEO factor that helps search engines determine the authority and credibility of your website. Here's how to build quality backlinks:
a) Guest Blogging: Write engaging and informative guest posts for relevant and authoritative websites in your industry, including a link back to your website in the author bio or within the content.
b) Social Media Engagement: Share your content on social media platforms to increase its visibility and encourage others to link to it.
c) Influencer Marketing: Collaborate with influential individuals or brands in your industry to gain exposure and earn quality backlinks.
d) Monitor and Remove Toxic Backlinks: Regularly analyze your backlink profile using tools like Google Search Console or Moz, and disavow or remove any low-quality or spammy backlinks that may harm your website's ranking.

Step 5: Optimize for Mobile Devices
With an increasing number of people accessing the internet through their mobile devices, optimizing for mobile has become crucial for SEO success. Ensure your website is mobile-friendly, as this can significantly impact your search engine rankings. Here are some tips for mobile optimization:
a) Responsive Design: Create a responsive website design that adapts to different screen sizes and resolutions.
b) Page Speed: Optimize your website's loading speed for mobile devices by compressing images and using caching techniques.
c) Mobile Usability: Ensure that your website is user-friendly on mobile devices by making navigation easy, text readable, and buttons clickable.

Step 6: Monitor, Analyze, and Continuously Improve
SEO is an ongoing process, and it's essential to monitor your website's performance, analyze data, and continuously make improvements based on your findings. Here are important steps in this process:
a) Implement Analytics: Install Google Analytics or other website analytics tools to track and analyze key performance metrics, such as organic traffic, bounce rate, and conversion rate.
b) Keyword Ranking: Regularly track your website's keyword rankings and adjust your SEO strategy based on how those rankings perform.
c) User Engagement: Analyze user behavior on your website, such as time spent on page, click-through rates, and conversion rates, to gain insights into user engagement and make necessary improvements.
d) SEO Audits: Conduct regular SEO audits to identify areas of improvement, such as broken links, duplicate content, or technical issues, and fix them promptly.

In conclusion, Wiley's best practice SEO tips provide a comprehensive roadmap to improve your website's search engine rankings. Remember to perform keyword research, optimize on-page elements, create high-quality content, build quality backlinks, optimize for mobile devices, and monitor and continuously improve your SEO efforts. By following these best practices, you can enhance your online visibility, increase organic traffic, and ultimately drive business success.
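The on-page elements listed in Step 2 (title tag, meta description, header tags, image alt text) can be checked mechanically. The following is a minimal sketch using only Python's standard `html.parser` module; the class name `OnPageAudit`, the `audit` helper, and the sample HTML are hypothetical and not part of the tips above, and a real audit would cover far more than these four checks.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects a few on-page SEO signals from one HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.h1_count = 0
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            # An image with no alt attribute, or an empty one, is counted.
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a dict summarizing the on-page elements found in `html`."""
    parser = OnPageAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "h1_count": parser.h1_count,
        "images_missing_alt": parser.images_missing_alt,
    }


# Hypothetical sample page, for illustration only.
SAMPLE = """<html><head><title>Blue Widgets | Example Shop</title>
<meta name="description" content="Hand-made blue widgets.">
</head><body><h1>Blue Widgets</h1>
<img src="a.png" alt="Blue widget on a desk"><img src="b.png">
</body></html>"""

report = audit(SAMPLE)
print(report)
```

A report like this only shows whether the elements are present; whether the title and description are compelling and keyword-relevant still needs human judgment.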
Optimization of Resource Allocation for Intelligent Reflecting Surface-enhanced Multi-UAV Assisted Semantic Communication
doi: 10.3969/j.issn.1003-3114.2024.02.018
Citation: WANG Haobo, WU Wei, ZHOU Fuhui, et al. Optimization of Resource Allocation for Intelligent Reflecting Surface-enhanced Multi-UAV Assisted Semantic Communication [J]. Radio Communications Technology, 2024, 50(2): 366-372.
WANG Haobo1, WU Wei1,2*, ZHOU Fuhui2, HU Bing3, TIAN Feng1
(1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; 2. College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; 3. School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
Abstract (from the Chinese): Unmanned Aerial Vehicles (UAV) offer a highly cost-effective solution for wireless communication systems. Building on this, a novel multi-UAV semantic communication system enhanced by an Intelligent Reflecting Surface (IRS) is proposed. The system comprises UAV equipped with IRS, Mobile Edge Computing (MEC) servers, and UAV that collect data and extract local semantic features. Optimizing signal reflection through the IRS markedly improves the communication quality between the UAV and the MEC servers. The formulated problem jointly optimizes the trajectories of multiple UAV, the IRS reflection coefficients, and the number of semantic symbols in order to minimize transmission delay. To solve this non-convex optimization problem, Deep Reinforcement Learning (DRL) algorithms are introduced: a Dueling Double Deep Q Network (D3QN) handles the discrete action space problems, such as UAV trajectory optimization and semantic symbol number optimization, and Deep Deterministic Policy Gradient (DDPG) handles the continuous action space problems, such as IRS reflection coefficient optimization, so that decisions can be made efficiently. Simulation results show that the proposed intelligent optimization scheme outperforms each benchmark scheme, particularly when the transmit power is low, and that it remains stable as the power varies.
Keywords (from the Chinese): UAV network; intelligent reflecting surface; semantic communication; resource allocation
CLC number: TN925; Document code: A; Open Science Identity (OSID):
Article ID: 1003-3114(2024)02-0366-07
Abstract: Unmanned Aerial Vehicles (UAV) present a cost-effective solution for wireless communication systems. This article introduces a novel Intelligent Reflecting Surface (IRS) to augment the semantic communication system among multiple UAVs. The system encompasses UAV equipped with IRS, Mobile Edge Computing (MEC) servers, and UAV featuring data collection and local semantic feature extraction functions. Optimizing signal reflection through IRS significantly enhances communication quality between drones and MEC servers. The formulated problem entails joint optimization of multiple drone trajectories, IRS reflection coefficients, and the number of semantic symbols to minimize transmission delays. To address this non-convex optimization problem, this paper introduces a Deep Reinforcement Learning (DRL) algorithm. Specifically, the Dueling Double Deep Q Network (D3QN) is employed to address discrete action space problems such as drone trajectory and semantic symbol quantity optimization. Additionally, the Deep Deterministic Policy Gradient (DDPG) algorithm is utilized to solve continuous action space problems, such as IRS reflection coefficient optimization, enabling efficient decision making. Simulation results demonstrate that the proposed intelligent optimization scheme outperforms various benchmark schemes, particularly in scenarios with low transmission power. Furthermore, the intelligent optimization scheme proposed in this paper exhibits robust stability in response to power changes.
Keywords: UAV network; IRS; semantic communication; resource allocation
Received: 2023-12-31
Foundation Item: National Key R&D Program of China (2020YFB1807602); National Natural Science Foundation of China (62271267); Key Program of Marine Economy Development Special Foundation of Department of Natural Resources of Guangdong Province (GDNRC[2023]24); National Natural Science Foundation of China (Young Scientists Fund) (62302237)
0 Introduction
Amid rapid technological development, Unmanned Aerial Vehicles (UAV) have become an important technology in wireless communication systems [1].
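Written Test for Research Positions, with Answers (a Fortune Global 500 Group)
(Answers are at the end.)
I. Single-choice questions (10 questions, 2 points each, 20 points in total)
1. Which of the following algorithms is a typical application of unsupervised learning?
A. Decision tree  B. Linear regression  C. K-means clustering  D. Logistic regression
2. Which of the following is not a key element of research project management?
A. Project time management  B. Budget preparation and control  C. Team collaboration and personnel management  D. Marketing strategy
3. During model training, overfitting usually occurs:
A. Early in training  B. Midway through training  C. Late in training  D. When training ends
4. Which statement about the backpropagation algorithm in deep learning is correct?
A. Backpropagation applies only to shallow networks  B. Backpropagation is the basic algorithm used to optimize model parameters  C. Backpropagation is the basic algorithm used to propagate signals forward  D. Backpropagation cannot be combined with gradient descent
5. What is the core of research project management?
A. Technical development efficiency  B. Team collaboration ability  C. Achieving the project goals  D. Innovative thinking
6. In experimental design, what is the key to ensuring that results are reproducible?
A. Random sampling  B. Using complex experimental equipment  C. Strict experimental operating procedures  D. Ensuring comprehensive data collection
7. In a team project, which form of communication ensures that information is conveyed and understood accurately?
A. Email  B. Oral report  C. Face-to-face meeting  D. Instant messaging
8. In scientific research, which statistical method can be used to test whether two groups of data differ significantly?
A. Chi-square test  B. t-test  C. Analysis of variance  D. Regression analysis
9. In materials science, which of the following materials is widely used for insulating layers and corrosion protection in electronic components?
A. Aluminum  B. Glass  C. Polytetrafluoroethylene  D. Steel
10. Semiconductor materials play a decisive role in electronics. Which of the following semiconductor materials has the largest energy gap between its valence band and conduction band?
A. Gallium arsenide  B. Silicon  C. Germanium  D. Carbon
II. Multiple-choice questions (10 questions, 4 points each, 40 points in total)
1. Which principles should research staff follow when designing a project?
A. Innovativeness  B. Scientific soundness  C. Feasibility  D. Economy  E. Conformity to standards
2. Which of the following should researchers pay attention to when writing academic papers?
A. Stating the purpose and significance of the research  B. Studying the background and current state of the field in depth  C. Presenting the experimental design and methods  D. Analyzing and discussing the results  E. Marking citations clearly
3. Which statistical methods are commonly used in processing experimental data?
A. Analysis of variance  B. Deviation calculation  C. Regression analysis  D. Correlation analysis  E. Variance calculation
4. Which of the following technologies are widely used in modern scientific research?
A. Gene editing  B. 3D printing  C. Cloud computing  D. Internet of Things  E. Deep learning
5. In machine learning, which of the following algorithms are unsupervised? ( )
A. k-means clustering  B. Decision tree  C. Support vector machine  D. Random forest  E. Linear regression  F. Principal component analysis
6. In deep learning, which of the following are common convolutional neural network (CNN) architectures? ( )
A. LeNet  B. AlexNet  C. VGG  D. Inception  E. LSTM  F. Transformer
7. Which of the following statements about research project management are correct? ( ) (2 points)
A. Research project management mainly emphasizes controlling project progress.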
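15 Free Academic Search Engines
Jokes, 2009-10-9, 00:49
Academic search is a free service that helps you quickly find scholarly material such as peer-reviewed literature, papers, books, preprints, abstracts, and technical reports.
This article introduces 15 academic search engines.
1. Google Scholar
Google's free academic search tool helps users quickly find scholarly material, including peer-reviewed articles, papers, books, abstracts, and technical reports from academic publishers, professional societies, preprint repositories, universities, and other scholarly organizations.
2. SCIRUS
SCIRUS, launched by Elsevier Science in April 2001, is to date the most comprehensive search engine on the Internet dedicated to scientific and technical information.
Built mainly on Elsevier's own resources, it integrates online resources of scientific value, bringing together scientific papers, technical reports, conference papers, professional literature, preprints, and similar material from science websites and science-related web pages.
Its aim is to collect scientific information as comprehensively and deeply as possible and to offer users a unified search interface.
3. ResearchIndex
ResearchIndex, also known as CiteSeer, is a digital library of academic papers built by the NEC Research Institute on the basis of Autonomous Citation Indexing (ACI). It lets users find literature by following citation links, with the goal of improving the dissemination of, and feedback on, scholarly literature in several respects.
ResearchIndex indexes academic papers on the Internet in PostScript and PDF formats.
More than 500,000 papers can currently be searched in its database.
It mainly covers computer science, with topics including Internet analysis and retrieval, digital libraries and citation indexing, machine learning, neural networks, speech recognition, face recognition, metasearch engines, and audio/music.
ResearchIndex's online services are completely free, including downloading full texts in PS or PDF format, and the system is updated in real time around the clock.
4. INFOMINE
INFOMINE is a virtual library of online academic resources built for university faculty, students, and researchers.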
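PageRank-Pro: An Improved Web Page Ranking Algorithm
LI Kai, HE Feng-ling, ZUO Wan-li
(College of Computer Science and Technology, Jilin University, Changchun 130012)
Abstract: The original PageRank algorithm is improved using web page click information, and Seidel iteration is used to speed up convergence of the iteration. Experimental results show that the improved iterative algorithm is about 23% more efficient.
Keywords: PageRank; Seidel iteration; user click count; search engine
CLC number: TP311  Document code: A  Article ID: 1671-5489(2003)02-0175-05
Received: 2002-07-12.
About the authors: LI Kai (b. 1977), male, master's student, works on Web mining and web search engines. Corresponding author: ZUO Wan-li (b. 1957), male, Ph.D., professor, works on Web mining and web search engines, E-mail: wanli@mail.jlu.edu.cn.
Funding: Jilin Province Science and Technology Development Plan Project Fund (grant no. 20000111).

1 Introduction
The Internet has been growing at high speed. In 1994 the earliest search engine, the World Wide Web Worm, indexed 110,000 web pages; by 1997 search engines indexed 2 to 100 million pages; by 2000 more than one billion pages could be indexed [1]; and today the Web is still growing by more than one million pages a day.
First-generation web search engines emphasized recall, whereas second-generation engines place more emphasis on the precision and relevance of retrieval. This gave rise to techniques that rank keyword-matched search results by analyzing web page links, represented by PageRank (an iterative algorithm in which a page's importance is determined by the importance of its parent pages). PageRank has been applied successfully in the well-known Google search engine, giving Google far better search precision than earlier engines.
At present there are three main algorithms that process keyword-matched search results on the basis of link analysis [2]:
(1) PageRank extracts the link information of web pages and is computed offline (which makes it more efficient than Hits).
(2) Hits divides web pages into two classes, Hubs (pages with many links pointing to other pages) and Authorities (pages pointed to by many external links); the two classes may overlap. It iterates
\[ a_{n+1}(u) = \sum_{(v,u) \in G} h_n(v), \qquad h_{n+1}(v) = \sum_{(v,u) \in G} a_n(u), \]
and ranks the results by the final authority values, where G denotes the web subgraph.
(3) Average and Sim. Strictly speaking, this approach is not based entirely on link analysis; it combines a similarity measure with link analysis. Average defines the importance of page p as the average similarity of all pages linking to it:
\[ \mathrm{authority}(p) = \frac{1}{|\{q \mid q \to p\}|} \sum_{q \to p} \mathrm{similarity}(q), \]
and Sim defines the importance of page p as the similarity of p itself plus the average similarity of all pages linking to it:
\[ \mathrm{authority}(p) = \mathrm{similarity}(p) + \frac{1}{|\{q \mid q \to p\}|} \sum_{q \to p} \mathrm{similarity}(q). \]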
以上3种方法都需要进行迭代计算,并且都没有使用互联网的动态信息;但是,只有Page Rank 是完全基于链接分析的(H its 的初始页面集合是文本搜索的结果,计算集合是由此扩展而来的).本文主要对Page Rank 的迭代效率和互联网动态信息(页面点击次数)的使用进行讨论.Vol.41 吉林大学学报(理学版) No.2 2003年4月 JOURN AL OF JILIN UN IV ERS ITY (SCIENCE EDI TION)175~179DOI:10.13413/ k i .jd xb l xb.2003.02.0142 PageRank 技术2.1 PageRank 的概念Pag e Rank 是著名搜索引擎Goog le 引入的网页排序算法,该排序根据网页间链接信息迭代计算得到,这里的链接信息是相对静态的,没有考虑网页使用的动态信息.本文网络仍被看作一个有向图:G =(V ,E ),其中V 是节点(网页)集,E 是边(当且仅当存在从页面i 到页面j 的链接时存在从节点i 到节点j 的边)集.Pag e Rank 的基本思想在于一个页面重要或者有链接指向它的页面多,或者有链接指向它的页面重要或者二者兼而有之.其初始定义如下:PR (q )=∑(p ,q )∈EP R (p )N p ,其中N p 表示节点p 的出度.在计算Page Ra nk 时,一般把它看作一个求矩阵特征向量的过程:M 表示G 的过渡矩阵,如果存在节点j 到节点i 的边,则置矩阵中元素m ij 的值为1/N j ,否则置为0.这样,最终的结果满足:x =Mx ,其中x 表示各页面的Page Rank 构成的向量.由M 的构成可知,矩阵M 的最大特征值为1,x 为1对应的特征向量.这样,可以用简单迭代法对上式进行求解.要保证上述迭代过程的收敛,M 必须满足两个条件:一是M 必须是不可约的(G 强联通);二是M 必须是非循环的.后者可由网络结构保证,前者可以通过在迭代过程中加一个潮湿因子予以保证[3].定义M ′=c M +(1-c )1NN ×N,用M ′代替M 进行计算,相当于在G 的每两个节点间增加了两条边,这样做的同时也解决了所谓的Rank Sink 问题.此时迭代形式如下:PR i +1=c M ×PR i +(1-c )×1NN ×1,这样,在保证迭代收敛的同时,也使Pag eRank 的定义变为:设页面T 1,T 2,…,T n 有链接指向页面A :P R (A )=(1-c )+cP R (T 1)N T 1+…+PR (T n )N T n,此时,Pag eRank 的定义符合随机冲浪模型[4].2.2 PageRank 的发展在Pag eRank 产生尤其是在Go ogle 中成功的应用之后,人们对算法本身及相关内容进行了进一步的研究,其中具有代表性的如网络链接信息的组织和存取[5],通过矩阵分块来提高Pag eRa nk 计算效率[3]以及个性化的Pag e Rank ,其出发点在于提高那些大众化的搜索结果质量,方法是由每一主题的专家页面与特定页面的链接结构决定其重要性,如Hillto p 系统[6],对不同主题使用不同的个性化向量,如主题敏感的Pag eRank [7],以及针对不同用户群使用不同的个性化向量等.一些研究工作对Pag eRank 进行了扩展,如结合特定的查询词考虑页面的向外链接的差异性,对随机冲浪模型进行修改,从而得到Directed Surf 模型[8];用处理不确定性的方法处理页面链接信息[9,10]等.3 PageRank 算法的改进3.1 在PageRank 算法中使用点击信息Pag e Rank 算法(以及其它的基于链接分析的网页处理技术)在计算时所使用的信息仅限于链接结176 吉林大学学报(理学版)V ol.41 构(或者还有文本内容),这些信息更新的周期较长,一般需要3~4周左右(由spider 程序的效率及搜索引擎的规模决定).忽略了互联网上每时每刻都在变化的大量的动态信息,这些信息的捕捉和利用十分困难.但是,如果成功地利用这些信息,必将获得很大的收益.除了个性化搜索,在一般的(通用)搜索引擎中,用户的参与也很重要.在我们的系统中,将用户的选择即搜索引擎用户的对url 的每一次点击就是对相应网页的一次选择作为评价网页重要性的一个因素.在用户使用搜索引擎时,其每次点击url 的动作都被引擎服务器记录下来,用户IP(及其它用户信息)与相应的页面id 一起保存,这样,在下次计算Pag eRank 时,就得到了一个新的向量(点击向量),其每一个分量是对应页面的点击次数与所有页面点击次数之和的比,记为b ,称为全局个性化向量,因为它代表了所有用户对页面选择的合力.在迭代过程中,b 
作为页面重要性的一部分,迭代公式修正为:PR i +1=c M ×PR i +(1-c )×1NN ×1+b ,把上式看作对方程x i +1=Hx i +b的求解,其收敛性不变[11];但是如果b 的某些分量值较大时,所需的迭代次数将增加.应该注意到,上式不能写成PR i +1=M ×PR i +b ,因为可能存在如下情况:b i =b j =b k =0, m ij ≠0, m jk ≠0, m ki ≠0,此时即出现了Rank Sink 问题.集体个性化向量的来源有两部分:一是服务器保存的用户点击信息;二是对网络上的新页面,其点击次数为0,此时我们给这样的页面一个随机的点击次数,这样做的同时,也给新的页面一个被用户接受的机会.应该强调的是,就象可以通过设置html 文档的标题来影响早期搜索引擎的结果[6]和通过设置网页链接来影响Goog le 的结果[12]一样,网站设计者也可以通过点击自己的网页来影响全局个性化Pag e Rank 的结果(尽管我们已经采取措施减弱这种影响).算法取得最好的结果当然是在所有用户都正常使用的情况下.3.2 加速PageRank 迭代收敛的Seidel 技巧对于迭代解出的过渡矩阵M 的特征向量,要用它对关键词匹配的搜索结果进行排序,显然我们关心的不是它的大小,而是它的方向(甚至只是其各分量的大小顺序).由此,可以在不影响M 特征向量的方向的前提下对M 进行适当的变换以加快迭代的收敛速度.简单迭代法的分量形式为xk i +1=∑i -1j =1mk jx j i+∑nj =i mk jx j i , k =1,2,…,n . 显然,在计算xk i +1时,x 1i +1,…,xk -1i +1都已经求出,一般地,后计算出的结果更接近最终结果.因此可以用这些新值来计算x k i +1,于是迭代形式变为xk i +1=∑i -1j =1mkjxj i +1+∑nj =imkjx j i , k =1,2,…,n .(3.1)这就是Seidel 技巧(计算过程中总是利用最新算出来的值,也称Seidel 迭代法)[8].Seidel 迭代法收敛的一个充分条件是‖H ‖≤1,其中H 为任一n 阶矩阵[12].由于‖M ‖∞=1,若直接使用Seidel 技巧则不能保证迭代收敛;同时,注意到特征值和特征向量的定义:Hx =λx ,令H ′=e H ,其中0<e <1,由矩阵特征值的性质可知H ′存在对应的特征值e λ,它对应的特征向量仍177 N o.2 李 凯等:Pag e Rank-Pro 一种改进的网页排序算法 为x .由此,可以对H ′使用Seidel 迭代法求H 的特征向量x .此外,迭代过程相当于H ′n x =e n H n x ,其中n 为迭代次数,实际求得的是e nx ,其方向与x 一致,各分量的值都相应的缩小了,所以计算时应选择适当的e ,以避免迭代过程中向下溢出.4 PageRank -Pro 算法描述输入:网络对应图的转移矩阵,用户点击表.输出:页面排序的Page Rank 值.步骤:(1)在保存点击信息时进行,根据返回给用户的页面所处位置的不同(处于第几页),给相应的点击加权.(2)对每个页面来自同一IP 的点击重新计数,方法是:第i 次点击作1/i 处理.(3)计算每个页面的点击次数,方法是:每个点击的次数乘以其权值求和,对于新的页面给一随机值;同时计算总的点击次数.(4)用每个页面的点击次数除以总点击次数,算出全局个性化向量.(5)按(3.1)式计算P R 每个份量,直至收敛.算法说明:(1)在搜索引擎返回给用户的结果中,页面的位置越靠前,它被点击的可能性就越大,显然返回结果第1页和第10页其被点击的意义不同.据此,我们将权值列于表1.Table 1 Page weight tableReturned pag e #Page po sitio n Clicking po ssibility (%)W eigh t 11~10 47.31.0211~2012.21.2321~307.42.0431~40 5.03.0541~50 3.75.0≥6≥5124.48.0 (2)页面的所有者可以自己点击其页面来影响b ,一个用户可能由于某种原因(比如误操作)连续点击一个页面几次,为减弱这种影响,我们对来自同一IP 的对同一页面点击的计数进行了处理.(3)互联网每天新增的网页多达100万,这样搜索引擎每次更新都会标引许多新的页面,其中必然存在部分优秀页面由于其链接位置不佳而使其Page Ra nk 值较低,不能引起用户的注意,所以对新的页面赋以随机值,以给它为用户所知的机会.5 实验结果分析表2给出了一组Seidel 迭代法的实验结果,表中n 
denotes the number of iterations; the simple iteration method needs 11 iterations. Table 3 shows how different values of e affect the iteration results, where d is the number of components that deviate from the result of the simple iteration method.

Table 2 A group of Seidel iteration results
e: 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95
n: 9, 11, 13, 16, 19, 24, 30, 40, 47
e: 0.975, 0.9775, 0.980, 0.990, 0.999, 0.9999, 0.99995, 0.99999, 1.0
n: 50, 50, 50, 49, 37, 23, 19, 9, 8

Table 3 Influence of e on iteration results
e: 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2
d: 224

Choosing a suitable e therefore reduces the number of iterations. Such values lie in two intervals: one close to 0 and one very close to 1. In the first interval, the precision of the result drops as e decreases and the direction of x changes, so without additional measures it is not a good place for e. In the second interval, the precision of the result is unaffected and the number of iterations is greatly reduced (by 23.2%). Its only drawback is that the interval (0.99999, 1) is very small, but since only a single value of e is needed, this interval can be used.

In summary, this paper discussed the iteration efficiency of the PageRank algorithm and proposed using the click counts of web pages, a form of dynamic Internet information, as a collective personalization vector in PageRank in order to improve the precision of search engine results. It also proposed replacing the commonly used simple iteration method with Seidel iteration, which speeds up the computation. Experimental results show that both improvements are effective. The algorithms have been applied in Chinavivi, a new-generation web search engine that we are developing.

PageRank-Pro: An Improved PageRank Algorithm
LI Kai, HE Feng-ling, ZUO Wan-li
(College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Abstract: PageRank is a web page ranking algorithm proposed by Google, a well-known search engine. The algorithm is an iterative process that determines web page ranking based on page link structure, or co-citation. PageRank is a successful, but not a perfect algorithm. For instance, a heavily linked web page might not be so important if it has few visitors. We first integrated page click information with the PageRank calculation, and then employed Seidel's method to speed up the convergence of the iteration process. Experimental results show that about 23% performance improvement is achieved with our improved algorithm.
Keywords: PageRank; Seidel iteration; user click frequency; search engine
(Editor in charge: ZHAO Li-qin)
Source: Journal of Jilin University (Science Edition), Vol. 41, No. 2. LI Kai et al.: PageRank-Pro, an improved web page ranking algorithm.
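The difference between the simple iteration and the Seidel iteration discussed above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy link graph, the damping value 0.85, and the function names are assumptions made for illustration, and the parameter e and the click-based personalization vector from the paper are not modeled.

```python
def build_incoming(links):
    """Return, for each page, the list of pages that link to it."""
    incoming = [[] for _ in links]
    for src, targets in enumerate(links):
        for dst in targets:
            incoming[dst].append(src)
    return incoming


def pagerank(links, seidel, damping=0.85, tol=1e-10, max_iter=1000):
    """Iterate x_i = (1 - d)/n + d * sum_j x_j / outdeg(j) over pages j linking to i.

    seidel=False: simple iteration, each sweep reads only the previous vector.
    seidel=True:  Seidel iteration, each sweep reuses values already updated in it.
    Returns (rank vector, number of sweeps). Assumes every page has an out-link.
    """
    n = len(links)
    incoming = build_incoming(links)
    out_deg = [len(targets) for targets in links]
    x = [1.0 / n] * n
    for sweep in range(1, max_iter + 1):
        source = x if seidel else list(x)  # Seidel reads the vector being updated
        change = 0.0
        for i in range(n):
            new = (1 - damping) / n + damping * sum(
                source[j] / out_deg[j] for j in incoming[i])
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x, sweep
    return x, max_iter


# Hypothetical 4-page web: page index -> pages it links to.
toy_links = [[1, 2], [2], [0], [0, 2]]
simple_rank, simple_sweeps = pagerank(toy_links, seidel=False)
seidel_rank, seidel_sweeps = pagerank(toy_links, seidel=True)
print(simple_sweeps, seidel_sweeps)
```

Both variants converge to the same rank vector; only the number of sweeps differs, which is the quantity the tables above report.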
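Content-Based Video Retrieval

Outline
Introduction to the problem
Research status at home and abroad
Overview of content-based video retrieval
Video structure analysis
Key technologies
Video retrieval and browsing
Open problems in current research and future trends

I. Introduction to the Problem
In recent years, digital video information has grown explosively, and new video applications such as digital libraries, video on demand, and digital television have become accepted by and familiar to more and more people.

Key frames are selected where the amount of motion reaches a local minimum. This reflects a "stillness" property of video data: a video emphasizes the importance of something by letting the camera rest at a new position or by a brief pause in a character's movement.
Optical flow
Optical flow field

First, the optical flow is computed with the Horn-Schunck method, and the magnitudes of the optical flow components of every pixel are summed to give the amount of motion M(k) of frame k, that is,
M(k) = Σ_i Σ_j ( |Ox(i,j,k)| + |Oy(i,j,k)| )
where Ox(i,j,k) is the X component of the optical flow at pixel (i,j) in frame k, and Oy(i,j,k) is the Y component of the optical flow at pixel (i,j) in frame k.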
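The key-frame rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the optical flow fields are taken as already computed (for instance by a Horn-Schunck implementation, which is not shown), each field is a plain nested list, and the function names are hypothetical.

```python
def motion_amount(flow_x, flow_y):
    """M(k): sum of |Ox| + |Oy| over all pixels of one frame."""
    return sum(abs(ox) + abs(oy)
               for row_x, row_y in zip(flow_x, flow_y)
               for ox, oy in zip(row_x, row_y))


def key_frames(motion):
    """Indices k at which M(k) is a strict local minimum."""
    return [k for k in range(1, len(motion) - 1)
            if motion[k] < motion[k - 1] and motion[k] < motion[k + 1]]


# Hypothetical 2x2 flow field for a single frame.
example_m = motion_amount([[1, -2], [0, 3]], [[-1, 0], [2, -2]])

# Hypothetical motion curve for six consecutive frames.
example_keys = key_frames([5, 3, 4, 6, 2, 7])
print(example_m, example_keys)
```

The first and last frames are never selected here because they have only one neighbor; a real system would need its own policy for the sequence boundaries.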
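Color Features
Color is the most salient feature of an image. Compared with other features, color features are simple to compute and stable, insensitive to rotation, translation, and scale changes, and therefore highly robust.
Color features include color histograms, dominant colors, average brightness, and so on.

Matching images by dominant color and average brightness is very coarse, but these features can serve as the coarse stage of a hierarchical retrieval method; the coarse results are then refined by matching color histograms computed on sub-blocks.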
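The coarse-then-fine matching described above might look like the following sketch. It makes several simplifying assumptions that are not in the slides: images are grayscale nested lists of equal size, gray-level histograms stand in for color histograms, and the brightness gap, block grid, and bin count are arbitrary illustrative values.

```python
def mean_brightness(img):
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def histogram(pixels, bins=4, levels=256):
    """Normalized histogram of gray levels."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1
    return [h / len(pixels) for h in hist]


def block_similarity(a, b, blocks=2, bins=4):
    """Average histogram intersection over a blocks x blocks grid of sub-blocks."""
    rows, cols = len(a), len(a[0])
    score = 0.0
    for bi in range(blocks):
        for bj in range(blocks):
            r0, r1 = bi * rows // blocks, (bi + 1) * rows // blocks
            c0, c1 = bj * cols // blocks, (bj + 1) * cols // blocks
            pa = [a[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            pb = [b[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            ha, hb = histogram(pa, bins), histogram(pb, bins)
            score += sum(min(x, y) for x, y in zip(ha, hb))
    return score / (blocks * blocks)


def coarse_to_fine(query, database, max_brightness_gap=40.0):
    """Coarse stage: keep images of similar brightness. Fine stage: rank by sub-blocks."""
    q_mean = mean_brightness(query)
    candidates = [(name, img) for name, img in database.items()
                  if abs(mean_brightness(img) - q_mean) <= max_brightness_gap]
    ranked = sorted(candidates,
                    key=lambda item: block_similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked]


query_img = [[0, 0], [255, 255]]
toy_db = {
    "same": [[0, 0], [255, 255]],
    "flipped": [[255, 255], [0, 0]],
    "dark": [[0, 0], [0, 0]],
}
result = coarse_to_fine(query_img, toy_db)
print(result)
```

The "dark" image is discarded by the cheap brightness test, so the more expensive sub-block comparison only runs on the two remaining candidates.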
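III. Overview of Content-Based Video Retrieval
The question to study is how an information retrieval system can properly represent the content a user asks for, find the information in a video database that matches the query, and return it to the user.
Content-Based Video Retrieval (CBVR)
Retrieves video data from large-scale video databases according to the content of the video and its contextual relations.
Provides an algorithm that operates without human involvement

At present, besides recognizing and describing the color, texture, shape, and spatial relations of images, research on content-based video retrieval focuses mainly on video segmentation, feature extraction and description (including visual features, color, texture, shape, motion information, and object information), key frame extraction, and structure analysis.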
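Sparse Retrieval and Rerank Models
Sparse retrieval and rerank models are two kinds of models commonly used in information retrieval; they can effectively improve the efficiency and accuracy of a search engine.
This article introduces and analyzes both in order to help readers understand and apply them.
I. Sparse retrieval models
1.1 The concept of sparse retrieval
A sparse retrieval model performs information retrieval by computing the similarity between a query and the documents.
It usually represents documents and queries with the vector space model or the bag-of-words model, and then uses the similarity between them to decide how relevant each result is.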
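As an illustration of the idea in 1.1, the following sketch scores documents against a query with TF-IDF weighted bag-of-words vectors and cosine similarity. The smoothed IDF formula, the tokenized toy documents, and the function names are assumptions chosen for the example, not something prescribed by the article.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}
    return sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)


toy_docs = [
    ["visual", "search", "reranking"],
    ["sparse", "retrieval", "model"],
    ["sparse", "vector", "search"],
]
hits = search(["sparse", "retrieval"], toy_docs)
print(hits)
```

The vectors are stored as dictionaries holding only the terms that occur, which is what makes the representation sparse.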
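1.2 Advantages of sparse retrieval
Sparse retrieval models are simple, intuitive, and easy to implement and extend.
They handle large document collections well and retrieve efficiently.
1.3 Limitations of sparse retrieval
Their representation of documents and queries is simple, so they cannot express the semantic similarity between a document and a query well.
They perform poorly on some complex retrieval tasks.
II. Rerank models
2.1 The concept of reranking
A rerank model performs a second round of ranking on top of the results of a conventional retrieval stage.
It usually uses a machine learning algorithm to reorder the retrieved results in order to improve their quality and relevance.
2.2 Advantages of rerank models
Rerank models can make full use of machine learning to optimize the retrieved results and improve their quality and relevance.
They handle some complex retrieval tasks well, such as word sense disambiguation and relevance feedback.
2.3 Limitations of rerank models
Rerank models depend heavily on machine learning and need large amounts of labeled data and computing resources.
In practice, the resources invested have to be weighed against the improvement obtained.
III. Combining sparse retrieval and rerank models
3.1 Ways of combining them
Sparse retrieval and rerank models can be combined in several ways, for example by using a rerank model to refine the sparse retrieval results, or by feeding the output of the rerank model back into the sparse retrieval model.
3.2 Advantages of the combination
Combining the two exploits the strengths of both models and improves retrieval efficiency and accuracy.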
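The first combination mentioned in 3.1, refining sparse retrieval results with a rerank model, can be sketched as a two-stage pipeline. Everything here is hypothetical: the term-overlap scorer is a stand-in for a real sparse retriever, and `toy_reranker` is a hand-written stand-in for a learned model.

```python
def sparse_score(query, doc):
    """First stage: count the query terms that literally occur in the document."""
    terms = set(doc)
    return sum(1 for t in query if t in terms)


def toy_reranker(query, doc):
    """Hypothetical stand-in for a learned model: rewards query terms that appear early."""
    return sum(1.0 / (1 + doc.index(t)) for t in query if t in doc)


def retrieve_then_rerank(query, docs, rerank_score, k=3):
    """Cheap scoring over all documents, expensive scoring only over the top k."""
    order = sorted(range(len(docs)),
                   key=lambda i: sparse_score(query, docs[i]),
                   reverse=True)
    top_k = order[:k]
    return sorted(top_k, key=lambda i: rerank_score(query, docs[i]), reverse=True)


toy_corpus = [
    ["apple", "pie", "recipe"],
    ["fresh", "apple", "juice", "recipe"],
    ["car", "repair"],
    ["recipe", "book", "about", "apple"],
]
final_order = retrieve_then_rerank(["apple", "recipe"], toy_corpus, toy_reranker)
print(final_order)
```

The design point is cost: the second-stage scorer is called on k documents rather than on the whole collection, which is why an expensive model is affordable there.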
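16 Academic Search Engines
Academic search is a free service that helps users quickly find scholarly material such as peer-reviewed literature, papers, books, preprints, abstracts, and technical reports.
This article introduces 16 academic search engines.
For any single discipline or field, two or three search engines are usually enough, typically the ones for which your institution has purchased full-text access.
Personally, I mostly use ISI Web of Knowledge and Google Scholar for English, CNKI and Wanfang for Chinese, and the Wanfang thesis search for Chinese dissertations.
Some universities do not allow their dissertations to be downloaded from outside the campus network. This is where Renren is useful: find a student at that university and the problem is solved.
1. Google Scholar
A free academic search tool from Google that helps users quickly find scholarly material, including peer-reviewed articles, papers, books, abstracts, and technical reports from academic publishers, professional societies, preprint repositories, universities, and other scholarly organizations.
2. SciVerse
Since August 28, 2010, selected web content from ScienceDirect, Scopus, and Scirus has been integrated into a platform called SciVerse.
So what is SciVerse? In short, it is a one-stop hub for a vast amount of research information.
The platform is meant to help researchers get "more information from fewer searches", and all of that information is research-related.
As the title suggests, SciVerse integrates information from ScienceDirect (SD), Scopus, and Scirus into a "SciVerse Hub".
In addition, SD and Scopus users can still use their existing services, which together with the SciVerse Hub form the three parts of SciVerse: "SciVerse ScienceDirect", "SciVerse Scopus", and "SciVerse Hub".
3. Web of Science / Web of Knowledge
Web of Science is a web-based product of Thomson Scientific (USA). It is a large, comprehensive, multidisciplinary citation index database of core journals. It comprises three major citation databases (Science Citation Index, SCI; Social Sciences Citation Index, SSCI; and Arts & Humanities Citation Index, A&HCI), two chemistry fact databases (Current Chemical Reactions, CCR, and Index Chemicus, IC), and three further citation databases: Science Citation Index Expanded (SCIE), Conference Proceedings Citation Index - Science (CPCI-S), and Conference Proceedings Citation Index - Social Science & Humanities (CPCI-SSH). ISI Web of Knowledge serves as the search platform.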
Distributed dynamic event-triggered optimization algorithm based on periodic sampling
Journal of Shandong University of Technology (Natural Science Edition), Vol. 38, No. 3, May 2024
Received: 2023-03-23
Funding: Natural Science Foundation of Jiangsu Province (BK20200824)
First author: XIA Lunchao, male, 20211249098@; corresponding author: ZHAO Zhongyuan, male, zhaozhongyuan@
Article ID: 1672-6197(2024)03-0058-07
XIA Lunchao 1, WEI Mengli 2, JI Qiutong 2, ZHAO Zhongyuan 1
(1. College of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China)
Abstract: A distributed zero-gradient-sum optimization algorithm based on a periodic sampling mechanism is proposed to address the optimization problem of multi-agent systems under undirected graphs, and a novel dynamic event-triggering strategy is designed. The strategy incorporates dynamic variables associated with the historical states of the agents, which effectively reduces the communication load of the system. The proposed algorithm allows the sampling period to be arbitrarily large and takes the influence of communication delay into account. Sufficient conditions for the convergence of the algorithm are derived using Lyapunov stability theory. Numerical simulations further demonstrate the effectiveness of the proposed algorithm.
Keywords: distributed optimization; multi-agent systems; dynamic event-triggered; communication delay
CLC number: TP273    Document code: A
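In recent years, distributed optimization of multi-agent systems has been widely studied because of its applications in many fields, such as the cooperation of multi-robot systems, intelligent transportation systems, and the distributed economic dispatch of microgrids [1-3]. Various distributed optimization algorithms have been proposed. Reference [4] proposed an algorithm that combines negative feedback with gradient flow to solve unconstrained optimization problems over balanced directed graphs; reference [5] proposed a distributed optimization algorithm based on an adaptive mechanism for problems whose local objective functions are nonconvex; reference [6] designed a disturbance-rejection distributed optimization algorithm that obtains the optimal solution in the presence of unknown external disturbances. However, these works require each agent to communicate with its neighbors continuously, which causes a heavy communication burden in practice. Reference [7] first proposed a distributed event-triggered controller for the consensus problem of multi-agent systems. The core of an event-triggered mechanism is an error-based triggering condition: agents communicate only when the condition is met. Reference [8] proposed an event-triggered subgradient optimization algorithm based on the edge information of the communication network and gave its exponential convergence rate. Reference [9] proposed a zero-gradient-sum algorithm based on an event-triggered mechanism that guarantees convergence of the system state to the optimal solution. These are static event-triggering strategies: the triggering threshold depends only on the agents' states, so when the states gradually converge the triggering condition is easily met and a large amount of unnecessary communication is generated. A more reasonable triggering condition is therefore needed. Reference [10] proposed a gain-scheduling controller with a dynamic event-triggered mechanism for the gain-scheduling control of nonlinear systems; reference [11] proposed a zero-gradient-sum algorithm based on a dynamic event-triggering condition for optimization over directed networks.

Because information transmission is complex, time delays are ubiquitous in practical systems, and there is a substantial literature on event-triggered optimization with delays. Reference [12] studied convex optimization for second-order systems and proposed two distributed optimization algorithms, one time-triggered and one event-triggered, under which all agents cooperatively converge to the optimal solution while unnecessary communication is effectively eliminated. Reference [13] proposed, for multi-agent systems with transmission delay, an event-triggered distributed optimization algorithm with sampled data and time delay, and obtained sufficient conditions for exponential stability of the system.

Inspired by [9,14], this paper proposes a distributed zero-gradient-sum algorithm based on a dynamic event-triggered mechanism. Compared with [15], which uses a static event-triggered mechanism, the dynamic mechanism avoids the waste of resources caused by frequent triggering when the agent states approach the optimal value. In addition, evaluating the dynamic event-triggering condition takes some time, so using the current state value is unrealistic; this paper therefore builds the triggering condition from the state value at the previous instant, which is more logical. Periodic sampling further reduces the communication frequency among agents, but too long a sampling period can affect convergence. Inspired by [14], the algorithm designed here allows the sampling period to be arbitrarily large, and for systems with delay it only requires the delay to be bounded by the sampling period in order to obtain sufficient conditions under which the multi-agent system reaches consensus and optimality. Finally, a general example is simulated to verify the effectiveness of the proposed algorithm.

1 Preliminaries and problem formulation

1.1 Graph theory

Let $\mathbb{R}$ denote the set of real numbers, $\mathbb{R}^n$ the set of $n$-dimensional vectors, and $\mathbb{R}^{n\times n}$ the set of $n\times n$ real matrices. The communication network of a multi-agent system with $n$ agents is modeled by a graph $G=(V,E)$ in which every agent is a node. The graph consists of the vertex set $V=\{1,2,\dots,n\}$ and the edge set $E\subseteq V\times V$. Define $A=[a_{ij}]\in\mathbb{R}^{n\times n}$ as the weighted adjacency matrix of $G$: $a_{ij}>0$ means that nodes $i$ and $j$ are connected, i.e., $(i,j)\in E$, and $a_{ij}=0$ means that they are not, i.e., $(i,j)\notin E$. $D=\mathrm{diag}\{d_1,\dots,d_n\}$ is the degree matrix, and the Laplacian matrix $L$ equals the degree matrix minus the adjacency matrix, i.e., $L=D-A$. When $G$ is undirected, its Laplacian matrix is symmetric.

1.2 Convex functions

Let $h_i:\mathbb{R}^n\to\mathbb{R}$ be a local convex function on the convex set $\Omega\subseteq\mathbb{R}^n$. There exists a positive constant $\varphi_i$ such that the following conditions hold [16]:

$$h_i(b)-h_i(a)-\nabla h_i(a)^{T}(b-a)\ge\frac{\varphi_i}{2}\|b-a\|^2,\quad\forall a,b\in\Omega, \tag{1}$$

$$\left(\nabla h_i(b)-\nabla h_i(a)\right)^{T}(b-a)\ge\varphi_i\|b-a\|^2,\quad\forall a,b\in\Omega, \tag{2}$$

$$\nabla^2 h_i(a)\ge\varphi_i I_n,\quad\forall a\in\Omega, \tag{3}$$

where $\nabla h_i$ is the gradient of $h_i$ and $\nabla^2 h_i$ is its second-order gradient (the Hessian matrix).

1.3 Problem description

Consider a multi-agent system with $n$ agents and suppose the cost function of agent $i$ is $f_i(x)$. The goal of this paper is to solve the optimization problem

$$x^{*}=\arg\min_{x\in\Omega}\sum_{i=1}^{n}f_i(x), \tag{4}$$

where $x$ is the decision variable and $x^{*}$ is the global optimum.

1.4 Main lemmas

Lemma 1. Suppose the communication graph $G$ is undirected and connected. Then for any $X\in\mathbb{R}^n$ the following holds [17]:

$$X^{T}LX\ge\frac{\alpha}{\beta}X^{T}L^{T}LX, \tag{5}$$

where $\alpha$ is the smallest positive eigenvalue of $\frac{L+L^{T}}{2}$ and $\beta$ is the largest eigenvalue of $L^{T}L$.

Lemma 2 (mean value theorem). Suppose the local cost function is continuously differentiable. Then for any real numbers $y$ and $y_0$ there exists $\tilde y=y_0+\tilde\omega(y-y_0)$ such that

$$f_i(y)=f_i(y_0)+\frac{\partial f_i}{\partial y}(\tilde y)(y-y_0), \tag{6}$$

where $\tilde\omega$ is a positive constant satisfying $\tilde\omega\in(0,1)$.

2 Distributed optimization algorithm based on the dynamic event-triggered mechanism and main results

2.1 Distributed dynamic event-triggered optimization algorithm with delay

This paper studies the optimization problem of multi-agent systems with delay. To reduce the communication frequency among agents, a distributed dynamic event-triggered optimization algorithm whose sampling period can be designed freely is proposed. The flow by which it optimizes communication is shown in Fig. 1. First, the states of the neighbors and of the agent itself at the previous triggering instant are sent to the controller (the algorithm proposed in this paper), which yields the agent state $x_i(t)$. Then a fixed sampling period $h$ is preset so that all agents sample at the same instants. An event detector is installed on every agent and checks the triggering condition only at the sampling instants. Next, the agent state at the previous sampling instant is sent to the constructed trigger for evaluation; when the triggering condition is met, the agent state at the triggering instant, $\hat x_i(t)$, is obtained. Finally, the local state $\hat x_i(t)$ is used to update the control action of the agent itself and of its neighbors. Because delays occur in real transmission, a delay satisfying $0<\tau<h$ is considered.

Fig. 1 Flowchart of the algorithm implementation

Consider a multi-agent system of $n$ agents in which every agent can compute independently and communicate with the others. Each agent $i$ has the dynamics

$$\dot x_i(t)=-\frac{1}{h}\left(\nabla^2 f_i(x_i)\right)^{-1}u_i(t), \tag{7}$$

where $u_i(t)$ is the designed control algorithm, namely

$$u_i(t)=\sum_{j=1}^{n}a_{ij}\left(\hat x_j(t-\tau)-\hat x_i(t-\tau)\right). \tag{8}$$

The designed dynamic event-triggering condition is

$$\theta_i\left(d_i e_i^2(lh)-\gamma q_i(lh-h)\right)\le\xi_i(lh), \tag{9}$$

$$q_i(t)=\sum_{j=1}^{n}a_{ij}\left(\hat x_i(t-\tau)-\hat x_j(t-\tau)\right)^2, \tag{10}$$

$$\dot\xi_i(t)=\frac{1}{h}\left[-\mu_i\xi_i(lh)+\delta_i\left(\gamma q_i(lh-h)-d_i e_i^2(lh)\right)\right], \tag{11}$$

where $d_i$ is the in-degree of agent $i$, $\gamma$ is a positive constant, and $\theta_i,\mu_i,\delta_i$ are design parameters. Let $x_i(lh)$ denote the agent state at a sampling instant; the error variable is $e_i(lh)=x_i(lh)-\hat x_i(lh)$.

Remark 1. When designing the dynamic event-triggering condition, different parameters $\theta_i,\mu_i,\delta_i$ can be set for each agent according to different requirements, so that each agent responds as accurately as possible in its particular setting. For ease of analysis, this paper uses the same $\theta_i,\mu_i,\delta_i$ for every agent, which makes the behavior and response of the agents easier to study.

2.2 Main results and analysis

Because the agents evaluate the triggering condition only at the sampling instants and communicate only after the condition is met, $\hat x_i(t-\tau)=\hat x_i(lh)$.

Theorem 1. Suppose the undirected graph $G$ is connected. For any $i\in V$ and $t>0$, if condition (12) holds, then under algorithm (7) and the dynamic event-triggering condition (9) the system state tends to the optimal solution $x^{*}$, i.e., $\lim_{t\to\infty}x_i(t)=x^{*}$.

$$\begin{cases}\dfrac12-\dfrac{\beta}{2\varphi_m\alpha}-\dfrac{\tau\beta}{2\varphi_m\alpha h}-\gamma>0,\\[2mm]\mu_i+\dfrac{\delta_i}{\theta_i}<1,\\[2mm]\mu_i-\dfrac{1-\delta_i}{\theta_i}>0,\end{cases} \tag{12}$$

where $\varphi_m=\min\{\varphi_1,\varphi_2\}$.

Proof. For $t\in[lh+\tau,(l+1)h+\tau)$, define the Lyapunov function $V(t)=V_1(t)+V_2(t)$, where

$$V_1(t)=\sum_{i=1}^{n}\left(f_i(x^{*})-f_i(x_i)-f_i'(x_i)(x^{*}-x_i)\right),\qquad V_2(t)=\sum_{i=1}^{n}\xi_i(t).$$

Let $E(t)=[e_1(t),\dots,e_n(t)]^{T}$, $X(t)=[x_1(t),\dots,x_n(t)]^{T}$, and $\hat X(t)=[\hat x_1(t),\dots,\hat x_n(t)]^{T}$. Differentiating $V_1(t)$ gives

$$\dot V_1(t)=\frac1h\sum_{i=1}^{n}u_i(t)\left(x^{*}-x_i(t)\right). \tag{13}$$

Since $\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}\left(\hat x_j(t-\tau)-\hat x_i(t-\tau)\right)\cdot x^{*}=0$ holds, we have

$$\dot V_1(t)=-\frac1h X^{T}(t)L\hat X(lh). \tag{14}$$

Since

$$\begin{aligned}X(t)&=X(lh+\tau)-(t-lh-\tau)\dot X(t)\\&=X(lh)+\tau\dot X(lh)+\frac{t-lh-\tau}{h}\Gamma_1L\hat X(lh)\\&=X(lh)-\frac{\tau}{h}\Gamma_2L\hat X(lh-h)+\frac{t-lh-\tau}{h}\Gamma_1L\hat X(lh),\end{aligned} \tag{15}$$

where $\Gamma_1=\mathrm{diag}\{(f_i''(\tilde x_1^{1}))^{-1},\dots,(f_i''(\tilde x_n^{1}))^{-1}\}$, $\Gamma_2=\mathrm{diag}\{(f_i''(\tilde x_1^{2}))^{-1},\dots,(f_i''(\tilde x_n^{2}))^{-1}\}$, $\tilde x_i^{1}\in(x_i(lh+\tau),x_i(t))$, and $\tilde x_i^{2}\in(x_i(lh),x_i(lh+\tau))$, substituting (15) into (14) gives

$$\begin{aligned}\dot V_1(t)=&-\frac1hE^{T}(lh)L\hat X(lh)-\frac1h\hat X^{T}(lh)L\hat X(lh)\\&+\frac{\tau}{h^2}\Gamma_2\hat X^{T}(lh-h)L^{T}L\hat X(lh)+\frac{t-lh-\tau}{h^2}\Gamma_1\hat X^{T}(lh)L^{T}L\hat X(lh).\end{aligned} \tag{16}$$

By (3), $(f_i''(\tilde x_i^{1}))^{-1}\le\frac{1}{\varphi_i}$, $i=1,\dots,n$, that is, $\Gamma_1\le\frac{1}{\varphi_m}I_n$ and $\Gamma_2\le\frac{1}{\varphi_m}I_n$ with $\varphi_m=\min\{\varphi_1,\varphi_2\}$.

First consider the term $\frac{t-lh-\tau}{h^2}\Gamma_1\hat X^{T}(lh)L^{T}L\hat X(lh)$. For $t\in[lh+\tau,(l+1)h+\tau)$, Lemma 1 and (3) give

$$\frac{t-lh-\tau}{h^2}\Gamma_1\hat X^{T}(lh)L^{T}L\hat X(lh)\le\frac{\beta}{h\varphi_m\alpha}\hat X^{T}(lh)L\hat X(lh)\le\frac{\beta}{2h\varphi_m\alpha}\sum_{i=1}^{n}q_i(lh), \tag{17}$$

where the last step follows from $\hat X^{T}(t)L\hat X(t)=\frac12\sum_{i=1}^{n}q_i(t)$.

Next consider $\frac{\tau}{h^2}\Gamma_2\hat X^{T}(lh-h)L^{T}L\hat X(lh)$. By Lemma 1 and Young's inequality,

$$\begin{aligned}\frac{\tau}{h^2}\Gamma_2\hat X^{T}(lh-h)L^{T}L\hat X(lh)&\le\frac{\tau\beta}{2h^2\varphi_m\alpha}\hat X^{T}(lh-h)L\hat X(lh-h)+\frac{\tau\beta}{2h^2\varphi_m\alpha}\hat X^{T}(lh)L\hat X(lh)\\&\le\frac{\tau\beta}{4h^2\varphi_m\alpha}\left[\sum_{i=1}^{n}q_i(lh-h)+\sum_{i=1}^{n}q_i(lh)\right].\end{aligned} \tag{18}$$

Substituting (17) and (18) into (16) gives

$$\dot V_1(t)\le\left(\frac{\beta}{2\varphi_m\alpha}+\frac{\tau\beta}{4\varphi_m\alpha h}-\frac12\right)\frac1h\sum_{i=1}^{n}q_i(lh)+\frac{\tau\beta}{4\varphi_m\alpha h^2}\sum_{i=1}^{n}q_i(lh-h)+\frac1h\sum_{i=1}^{n}d_ie_i^2(lh). \tag{19}$$

From (11),

$$\dot V_2(t)=-\sum_{i=1}^{n}\frac{\mu_i}{h}\xi_i(lh)+\sum_{i=1}^{n}\frac{\delta_i}{h}\left(\gamma q_i(lh-h)-d_ie_i^2(lh)\right). \tag{20}$$

Combining (19) and (20) gives

$$\begin{aligned}\dot V(t)\le&-\left(\frac12-\frac{\beta}{2\varphi_m\alpha}-\frac{\tau\beta}{4\varphi_m\alpha h}\right)\frac1h\sum_{i=1}^{n}q_i(lh)+\frac{\tau\beta}{4\varphi_m\alpha h^2}\sum_{i=1}^{n}q_i(lh-h)\\&+\frac{\gamma}{h}\sum_{i=1}^{n}q_i(lh-h)-\frac1h\sum_{i=1}^{n}\left(\mu_i-\frac{1-\delta_i}{\theta_i}\right)\xi_i(lh).\end{aligned} \tag{21}$$

Therefore, by the positive definiteness of the Lyapunov function and the squeeze theorem,

$$\begin{aligned}V((l+1)h+\tau)-V(lh+\tau)\le&-\left(\frac12-\frac{\beta}{2\varphi_m\alpha}-\frac{\tau\beta}{4\varphi_m\alpha h}\right)\sum_{i=1}^{n}q_i(lh)+\frac{\tau\beta}{4\varphi_m\alpha h}\sum_{i=1}^{n}q_i(lh-h)\\&+\gamma\sum_{i=1}^{n}q_i(lh-h)-\sum_{i=1}^{n}\left(\mu_i-\frac{1-\delta_i}{\theta_i}\right)\xi_i(lh).\end{aligned} \tag{22}$$

Iterating (22) gives

$$\begin{aligned}V((l+1)h+\tau)-V(h+\tau)\le&-\left(\frac12-\frac{\beta}{2\varphi_m\alpha}-\frac{\tau\beta}{2\varphi_m\alpha h}-\gamma\right)\sum_{k=1}^{l-1}\sum_{i=1}^{n}q_i(kh)+\frac{\tau\beta}{4\varphi_m\alpha h}\sum_{i=1}^{n}q_i(0h)\\&-\left(\frac12-\frac{\beta}{2\varphi_m\alpha}-\frac{\tau\beta}{4\varphi_m\alpha h}\right)\sum_{i=1}^{n}q_i(lh)-\sum_{k=1}^{l}\sum_{i=1}^{n}\left(\mu_i-\frac{1-\delta_i}{\theta_i}\right)\xi_i(kh),\end{aligned} \tag{23}$$

and further

$$\begin{aligned}\lim_{l\to\infty}\left(V((l+1)h)-V(h)\right)\le&\ \frac{\tau\beta}{4\varphi_m\alpha h}\sum_{i=1}^{n}q_i(0h)-\sum_{i=1}^{n}\left(\mu_i-\frac{1-\delta_i}{\theta_i}\right)\sum_{l=1}^{\infty}\xi_i(lh)\\&-\left(\frac12-\frac{\beta}{2\varphi_m\alpha}-\frac{\tau\beta}{2\varphi_m\alpha h}-\gamma\right)\sum_{l=1}^{\infty}\sum_{i=1}^{n}q_i(lh).\end{aligned} \tag{24}$$

Since $q_i(lh)\ge0$ and $V(t)\ge0$, (24) gives

$$\lim_{l\to\infty}\sum_{i=1}^{n}\xi_i(lh)=0. \tag{25}$$

From the definition of $\xi_i$ and the properties of the Laplacian matrix, the final state of every agent equals the same constant, i.e.,

$$\lim_{t\to\infty}x_1(t)=\dots=\lim_{t\to\infty}x_n(t)=c. \tag{26}$$

Since the second derivative of the objective function has the property

$$\sum_{i=1}^{n}\frac{\mathrm{d}\left(f_i'(x_i(t))\right)}{\mathrm{d}t}=-\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}\left(\hat x_j(t)-\hat x_i(t)\right)=-\mathbf{1}^{T}L\hat X(t)=0, \tag{27}$$

where $\mathbf{1}=[1,\dots,1]^{T}$ is the $n$-dimensional all-ones vector, we obtain

$$\sum_{i=1}^{n}f_i'(x_i(t))=\sum_{i=1}^{n}f_i'(x_i^{*})=0. \tag{28}$$

Combining (26) and (28) gives

$$\lim_{t\to\infty}x_1(t)=\dots=\lim_{t\to\infty}x_n(t)=c=x^{*}. \tag{29}$$

This completes the proof of Theorem 1.

When the communication delay $\tau$ is not considered, Corollary 1 follows from Theorem 1.

Corollary 1. Suppose the communication graph $G$ is undirected and connected. Without the delay $\tau$, for any $i\in V$ and $t>0$, if condition (30) holds, the agent states tend to the optimal solution under algorithm (7) and triggering condition (9).

$$\begin{cases}\dfrac14-\dfrac{n-1}{\varphi_m}-\gamma>0,\\[2mm]\mu_i+\dfrac{\delta_i}{\theta_i}<1,\\[2mm]\mu_i-\dfrac{1-\delta_i}{\theta_i}>0.\end{cases} \tag{30}$$

Proof. The proof is similar to that of Theorem 1, whose result gives

$$\frac14-\frac{\beta}{2\varphi_m\alpha}-\gamma>0. \tag{31}$$

Let $\lambda_n=\frac{\beta}{\alpha}$. Because $\lambda_n$ is global information of the multi-agent system, it is hard for an individual agent to obtain, but its upper bound can be estimated from

$$\lambda_n\le2d_{\max}\le2(n-1), \tag{32}$$

where $d_{\max}=\max\{d_i\}$, $i=1,\dots,n$. This yields the sufficient condition for the algorithm without delay:

$$\frac14-\frac{n-1}{\varphi_m}-\gamma>0. \tag{33}$$

This proves Corollary 1.

Remark 2. The stability condition obtained in Theorem 1 shows that when the sampling period $h$ is small, $\tau$ and $h$ offset each other because $0<\tau<h$, so stability is not affected; when $h$ is large, the term $\frac{\tau\beta}{2\varphi_m\alpha h}$ is negligible. The theoretical analysis therefore shows that the sampling period may be arbitrarily large. In simulation, the larger the sampling period $h$, the longer the convergence time, but the final result still tends to the optimal solution. In [18], by contrast, too large a sampling period makes the stability condition hard to satisfy, so that the algorithm eventually fails to converge and cannot reach the optimal solution. Allowing an arbitrarily large sampling period is therefore an important innovation of the proposed algorithm.

3 Simulation

A multi-agent network with four agents is simulated numerically; the communication topology is shown in Fig. 2. Four agents are used only for a preliminary validation of the proposed algorithm. Note that as the number of agents grows, the time and space complexity of the algorithm increase, but its effectiveness is not affected, so the algorithm applies equally to larger multi-agent networks.

Cost functions are usually chosen to be convex. For example, in distributed sensor networks the cost function is $\|z_i-x\|^2+\varepsilon_i\|x\|^2$, where $x$ is the unknown parameter to be estimated, $\varepsilon_i$ is the observation noise, and $z_i$ is a random number uniformly distributed in $(0,1)$; in microgrids the cost function is $a_ix^2+b_ix+c_i$, where $a_i,b_i,c_i$ are the generator cost parameters. The cost functions in these two settings differ in form, but both are convex. This paper uses the general cost function from [19], equation (34), to demonstrate the feasibility of the algorithm on convex functions. Moreover, the structure of the communication topology does not affect the design of the cost function, so the cost function used here is general for convex optimization problems over distributed networks.

$$g_i(x)=(x-i)^4+4i(x-i)^2,\quad i=1,2,3,4. \tag{34}$$

Clearly, each local cost function is minimized when $x_i$ equals $i$, but this is not the global optimum $x^{*}$, so the proposed algorithm is needed to find $x^{*}$. The key parameters are set first: $\varphi_m=16$, $\gamma=0.1$, $\theta_i=1$, $\xi_i(0)=5$, $\mu_i=0.2$, $\delta_i=0.2$, $x_i(0)=i$, $i=1,2,3,4$.

Fig. 2 Communication topology

Fig. 3 shows the agent states when algorithm (7) solves optimization problem (4) with sampling period $h=3$ and delay $\tau=0.02$. The agents asymptotically reach consensus in Fig. 3, and the consensus value is the global optimum $x^{*}=2.935$. When the influence of the sampling period is not taken into account, i.e., with $h=3$ and $\tau=0.02$ and using algorithm (10) of [18], the agent states are as shown in Fig. 4. Clearly, having avoided the influence of the sampling period, the algorithm of this paper converges faster. Compared with [18], the value of $q_i(lh)$ is available only after the event-triggering evaluation of agent $i$ and its neighbors is complete, so building the dynamic event-triggering condition from the state at the previous instant, as done here, is more logical.

Fig. 3 Agent states under algorithm (7) with $h=3$, $\tau=0.02$

Fig. 4 Agent states under algorithm (10) with $h=3$, $\tau=0.02$

To further analyze the influence of the sampling period, the delay $\tau$ is kept fixed and a different sampling period $h$ is chosen; the results are shown in Fig. 5. Comparison with Fig. 3 shows that a larger sampling period slows convergence. This is to be expected for algorithm (7), because a larger $h$ weakens the feedback gain and reduces the number of control updates in a fixed finite time interval, as shown in Figs. 6 and 7. Clearly, a larger sampling period markedly lowers the communication frequency of the agents but also slows convergence. So although the sampling period may be arbitrarily large, convergence speed has to be traded off against communication frequency when choosing the best sampling period.

Fig. 5 Agent states with $h=1$, $\tau=0.02$

Fig. 6 Event-triggering instants with $h=3$, $\tau=0.02$

Fig. 7 Event-triggering instants with $h=1$, $\tau=0.02$

Finally, the sampling period $h$ is fixed and the agent states for $\tau=0.02$ and $\tau=2$ are compared; the result is shown in Fig. 8. Clearly, the delay lengthens the time the agents need to find the global optimum, but because the delay is bounded by the sampling period, consensus is still reached eventually for any finite delay.

Fig. 8 Agent states with $h=3$, $\tau=2$

4 Conclusion

This paper studied the optimization problem of multi-agent systems under undirected graphs and proposed a zero-gradient-sum algorithm based on a dynamic event-triggered mechanism. The mechanism includes a dynamic variable related to the agent state at the previous instant, which avoids the communication burden caused by frequent triggering when the agent states approach the optimal value. The influence of the sampling period was considered in the design of both the algorithm and the triggering condition, and the designed algorithm allows the sampling period to be arbitrarily large. For systems with delay, sufficient conditions guaranteeing consensus and optimality of the multi-agent system were given for the case in which the maximum allowable transmission delay is smaller than the sampling period. Future work will extend the algorithm to directed graphs and switching topologies.

References:
[1] YANG Hongjun, WANG Zhenyou. Optimized design of FIR filters based on distributed arithmetic and look-up tables [J]. Journal of Shandong University of Technology (Natural Science Edition), 2009, 23(5): 104-106, 110.
[2] CHEN W, LIU L, LIU G P. Privacy-preserving distributed economic dispatch of microgrids: A dynamic quantization-based consensus scheme with homomorphic encryption [J]. IEEE Transactions on Smart Grid, 2022, 14(1): 701-713.
[3] ZHANG Lixin, LIU Wei. Optimization of distribution networks with distributed generation based on an improved PSO algorithm [J]. Journal of Shandong University of Technology (Natural Science Edition), 2017, 31(6): 53-57.
[4] KIA S S, CORTES J, MARTINEZ S. Distributed convex optimization via continuous-time coordination algorithms with discrete-time communication [J]. Automatica, 2015, 55: 254-264.
[5] LI Z H, DING Z T, SUN J Y, et al. Distributed adaptive convex optimization on directed graphs via continuous-time algorithms [J]. IEEE Transactions on Automatic Control, 2018, 63(5): 1434-1441.
[6] DUAN Shuqing, CHEN Sen, ZHAO Zhiliang. Active disturbance rejection distributed optimization algorithm for first-order multi-agent systems with disturbances [J]. Control and Decision, 2022, 37(6): 1559-1566.
[7] DIMAROGONAS D V, FRAZZOLI E, JOHANSSON K H. Distributed event-triggered control for multi-agent systems [J]. IEEE Transactions on Automatic Control, 2012, 57(5): 1291-1297.
[8] KAJIYAMA Y C, HAYASHI N K, TAKAI S. Distributed subgradient method with edge-based event-triggered communication [J]. IEEE Transactions on Automatic Control, 2018, 63(7): 2248-2255.
[9] LIU J Y, CHEN W S, DAI H. Event-triggered zero-gradient-sum distributed convex optimisation over networks with time-varying topologies [J]. International Journal of Control, 2019, 92(12): 2829-2841.
[10] COUTINHO P H S, PALHARES R M. Codesign of dynamic event-triggered gain-scheduling control for a class of nonlinear systems [J]. IEEE Transactions on Automatic Control, 2021, 67(8): 4186-4193.
[11] CHEN W S, REN W. Event-triggered zero-gradient-sum distributed consensus optimization over directed networks [J]. Automatica, 2016, 65: 90-97.
[12] TRAN N T, WANG Y W, LIU X K, et al. Distributed optimization problem for second-order multi-agent systems with event-triggered and time-triggered communication [J]. Journal of the Franklin Institute, 2019, 356(17): 10196-10215.
[13] YU G, SHEN Y. Event-triggered distributed optimisation for multi-agent systems with transmission delay [J]. IET Control Theory & Applications, 2019, 13(14): 2188-2196.
[14] LIU K E, JI Z J, ZHANG X F. Periodic event-triggered consensus of multi-agent systems under directed topology [J]. Neurocomputing, 2020, 385: 33-41.
[15] CUI Dandan, LIU Kaien, JI Zhijian, et al. Periodic event-triggered distributed convex optimization of multi-agent systems [J]. Control Engineering of China, 2022, 29(11): 2027-2033.
[16] LU J, TANG C Y. Zero-gradient-sum algorithms for distributed convex optimization: The continuous-time case [J]. IEEE Transactions on Automatic Control, 2012, 57(9): 2348-2354.
[17] LIU K E, JI Z J. Consensus of multi-agent systems with time delay based on periodic sample and event hybrid control [J]. Neurocomputing, 2016, 270: 11-17.
[18] ZHAO Z Y. Sample-based dynamic event-triggered algorithm for optimization problem of multi-agent systems [J]. International Journal of Control, Automation and Systems, 2022, 20(8): 2492-2502.
[19] LIU J Y, CHEN W S. Distributed convex optimisation with event-triggered communication in networked systems [J]. International Journal of Systems Science, 2016, 47(16): 3876-3887.
(Editor: DU Qingling)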
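The graph-theoretic objects used throughout the paper (adjacency matrix, degree matrix, and Laplacian of an undirected graph) can be made concrete with a small sketch. The four-agent ring used below is a hypothetical topology chosen for illustration; the paper's actual topology is given only as a figure and is not reproduced here.

```python
def laplacian(adjacency):
    """Return L = D - A for a weighted adjacency matrix with zero diagonal."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [[(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
            for i in range(n)]


def is_symmetric(matrix):
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


# Hypothetical undirected ring of four agents: 1-2, 2-3, 3-4, 4-1.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
L = laplacian(ring)
row_sums = [sum(row) for row in L]
d_max = max(L[i][i] for i in range(len(L)))
print(L, row_sums, d_max)
```

Every row of the Laplacian sums to zero, so the all-ones vector lies in its null space; this is the property that lets a consensus term built from the Laplacian vanish when all agents agree.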
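An Analysis of Policy Optimization Methods in Deep Reinforcement Learning (Part 10)
Deep reinforcement learning (DRL) is an emerging technique in artificial intelligence in which an agent learns by interacting with an environment in order to accomplish a specific task.
Policy optimization is a central problem in DRL: it directly determines how well the agent performs in the environment and how efficiently it learns.
This article analyzes the policy optimization methods used in deep reinforcement learning.
I. Policy gradient methods
One important class of policy optimization methods in DRL is policy gradient methods.
Policy gradient methods optimize the policy function directly so that the agent obtains the largest possible long-term reward in the environment.
Common policy gradient methods include the REINFORCE algorithm and Proximal Policy Optimization (PPO).
REINFORCE is a basic policy gradient method: it estimates the policy gradient from sampled trajectories and updates the policy parameters by gradient ascent.
However, REINFORCE suffers from low sample efficiency and high variance.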
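A minimal sketch of the REINFORCE update may help here. It uses a deliberately tiny setting that is an assumption of this example rather than of the article: a two-armed bandit with deterministic rewards, a softmax policy over two preference values, and one-step episodes so that the return equals the immediate reward.

```python
import math
import random


def softmax(prefs):
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, episodes=500, lr=0.1, seed=0):
    """REINFORCE on a bandit: theta += lr * return * grad log pi(action)."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(episodes):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        ret = rewards[action]  # one-step episode: the return is the reward
        for a in range(len(theta)):
            # Gradient of log softmax: indicator(a == action) - pi(a).
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * ret * grad_log  # gradient ascent step
    return softmax(theta)


final_probs = reinforce_bandit([0.0, 1.0])
print(final_probs)
```

Each update uses a single sampled action, which is where the high variance mentioned above comes from; practical implementations usually subtract a baseline from the return to reduce it.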
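To address these problems, PPO has become a popular algorithm in deep reinforcement learning in recent years.
By limiting the size of each policy update, PPO improves the efficiency and stability of policy optimization.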
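One common way PPO limits the update size is its clipped surrogate objective, sketched below for a single sample. The clipping range 0.2 is a frequently used default and is an assumption of this example; `ratio` stands for the probability of the action under the new policy divided by its probability under the old policy.

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_batch_objective(ratios, advantages, eps=0.2):
    """Mean clipped objective over a batch; PPO maximizes this quantity."""
    values = [ppo_clip_objective(r, a, eps) for r, a in zip(ratios, advantages)]
    return sum(values) / len(values)


good_action_capped = ppo_clip_objective(1.5, 1.0)    # gain is capped at 1 + eps
bad_action_floor = ppo_clip_objective(0.5, -1.0)     # no extra credit below 1 - eps
bad_action_penalty = ppo_clip_objective(1.5, -1.0)   # penalty is not clipped away
print(good_action_capped, bad_action_floor, bad_action_penalty)
```

Taking the minimum makes the objective a pessimistic bound: the policy gains nothing from moving the ratio far outside the clipping range, which removes the incentive for very large updates.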
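II. Value-function-based methods
Besides policy gradient methods, policy optimization in DRL also includes methods based on value functions.
A value function estimates the value of a state or of a state-action pair.
Common value-based methods include Q-learning and Actor-Critic.
Q-learning is a value-based policy optimization method that iteratively updates the action-value function in order to maximize the long-term reward.
However, Q-learning faces difficulties with continuous action spaces and high-dimensional state spaces.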
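The iterative update of the action-value function can be sketched with tabular Q-learning on a toy problem. The four-state chain, the reward of 1 for reaching the last state, and the purely random behavior policy are all assumptions made for this illustration.

```python
import random


def q_learning_chain(n_states=4, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a chain; actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _step in range(100):
            action = rng.randrange(2)  # random exploration (off-policy learning)
            nxt = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if nxt == goal else 0.0
            # The goal is terminal, so no future value is added there.
            target = reward if nxt == goal else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            if nxt == goal:
                break
            state = nxt
    return q


q_table = q_learning_chain()
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(3)]
print(q_table, greedy_policy)
```

The table has one entry per state-action pair, which is exactly what stops scaling when states are high-dimensional or actions are continuous, the difficulty noted above.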
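Actor-Critic algorithms were developed to address these problems.
Actor-Critic combines value-function estimation with policy improvement: it uses the information in the value function to guide policy optimization, which improves the efficiency and stability of deep reinforcement learning.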
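How the value function guides the policy can be shown with a single one-step actor-critic update. The tabular value estimates, the softmax action preferences, and the learning rates are assumptions of this sketch; deep variants replace both tables with neural networks.

```python
import math


def actor_critic_step(values, prefs, state, action, reward, next_state, done,
                      gamma=0.9, lr_value=0.1, lr_policy=0.1):
    """Apply one actor-critic update in place and return the TD error."""
    target = reward if done else reward + gamma * values[next_state]
    td_error = target - values[state]      # critic's judgment of the transition
    values[state] += lr_value * td_error   # critic update

    top = max(prefs[state])
    exps = [math.exp(p - top) for p in prefs[state]]
    total = sum(exps)
    probs = [e / total for e in exps]
    for b in range(len(prefs[state])):
        grad_log = (1.0 if b == action else 0.0) - probs[b]
        prefs[state][b] += lr_policy * td_error * grad_log  # actor update
    return td_error


state_values = [0.0, 0.0]
action_prefs = [[0.0, 0.0], [0.0, 0.0]]
delta = actor_critic_step(state_values, action_prefs,
                          state=0, action=1, reward=1.0, next_state=1, done=True)
print(delta, state_values, action_prefs)
```

Compared with REINFORCE, the actor is scaled by the critic's TD error rather than by a full sampled return, which typically lowers the variance of the update.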
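III. Challenges and outlook for policy optimization
Policy optimization in deep reinforcement learning faces many challenges.
First, low sample efficiency and high variance are the main problems of policy gradient methods, while value-based methods face high sample complexity and slow convergence.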
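Features of Visual Ranking
Today I learned about the ranking tool Visual Ranking. It is very helpful for teaching, so I recommend it to everyone.
Supporting higher-order thinking
"How People Learn" (Bransford, Brown, and Cocking, 2000), the landmark book on how people learn, describes how an experienced history teacher used the process of ranking lists to stimulate higher-order thinking in students.
Ninth-grade students were asked to list the important artifacts in history; their responses were gathered on a large poster and compared.
A year later the students returned to their lists, having gained new knowledge and a more careful way of thinking.
They clearly explained and discussed the historical significance on which the ranking of their lists was based, and they revised and refined their criteria for judging historical significance.
By ranking artifacts by historical significance and discussing the ranking criteria, the students did more than memorize facts and the dates of historical events. They developed a deep understanding of the interpretive nature of history, which is the basic educational idea of the discipline.
Although they made their ranked lists on posters rather than with a networked tool, the activity is essentially the same as the learning activities that Visual Ranking supports. These activities cover the full range of cognitive and affective skills in Benjamin Bloom's taxonomy of important intellectual behaviors in learning (Bloom, 1956).
Bloom's taxonomy of learning objectives has six levels in the cognitive domain and five levels in the affective domain.
Developing skills in the cognitive domain
Visual Ranking encourages students to develop skills at every level of Bloom's learning objectives:
Knowledge: Students can use the tool to order a series of items, such as events in the order in which they occur (for example, how a bill becomes a law, or the steps of meiosis), or objects measured on a scale (for example, the distances of the nine planets from the Sun, or the densities of the noble gases).
Comprehension and application: Class activities may include tasks that rank items according to objective calculations or the interpretation of data, while the comment box of each item records what the student has understood.
For example, students can use population and area data to rank population density, rank cars by fuel consumption per 100 km and exhaust emission data, or rate politicians by their voting records on key issues.
Analysis: The skills at this level (organizing, differentiating, comparing, and contrasting) are the focus of the tool.
To rank items on a scale, students must understand not only the items being compared but also the nature of the scale and the criteria for placing items on it.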
Reading Books and Doing Homework (English)
Reading books and doing homework are essential activities for students to enhance their knowledge and skills. Here are some tips and insights into these activities:
1. Choosing the Right Books: Select books that align with your interests and academic requirements. Diverse genres can broaden your perspective and improve your understanding of various subjects.
2. Setting a Reading Schedule: Allocate specific times for reading to develop a habit. Consistency is key to retaining information and improving reading speed.
3. Active Reading: Engage with the text by taking notes, underlining important points, and asking questions. This helps in better comprehension and recall.
4. Vocabulary Expansion: Encountering new words while reading is an opportunity to expand your vocabulary. Make a list of unfamiliar words and look up their meanings to enhance your language skills.
5. Critical Thinking: Analyze the author's arguments, the narrative structure, and the themes presented in the book. This practice improves critical thinking and analytical skills.
6. Relating to Real Life: Draw connections between what you read and real-world situations. This helps in applying theoretical knowledge to practical scenarios.
7. Homework Prioritization: Start your homework by prioritizing tasks based on deadlines and difficulty levels. This ensures that you manage your time effectively.
8. Understanding the Assignment: Before starting, make sure you understand the requirements of the assignment. If unclear, seek clarification from your teacher or peers.
9. Organized Work Space: Create a clean and organized workspace to minimize distractions and improve focus.
10. Breaking Down Tasks: Break large assignments into smaller, manageable tasks. This makes the work seem less daunting and helps in maintaining momentum.
11. Research Skills: Develop strong research skills for assignments that require gathering information. Use reliable sources and practice proper citation.
12. Peer Collaboration: Working with classmates can provide different perspectives and insights. However, ensure that collaboration does not turn into plagiarism.
13. Proofreading: Always review your work for grammatical errors, inconsistencies, and clarity of expression. Proofreading is crucial for producing high-quality assignments.
14. Using Technology: Utilize educational apps, online resources, and tools to assist with your homework. These can provide additional explanations, examples, and practice opportunities.
15. Seeking Help: If you're struggling with a particular subject or assignment, don't hesitate to ask for help from teachers, tutors, or classmates.
16. Reflecting on Learning: After completing reading or homework, reflect on what you've learned and how it can be applied or further explored.
17. Balanced Approach: While academics are important, maintain a balance with other activities such as sports, hobbies, and socializing to ensure a well-rounded development.
18. Healthy Habits: Maintaining good health habits like regular exercise, a balanced diet, and adequate sleep can significantly impact your ability to focus and learn effectively.
19. Goal Setting: Set short-term and long-term goals for your reading and homework. This provides direction and motivation to stay on track.
20. Rewarding Yourself: After completing a task or reaching a milestone, reward yourself with a break or a small treat. This positive reinforcement can boost your motivation.
Incorporating these strategies can make your reading and homework sessions more productive and enjoyable. Remember, the key to academic success is a combination of consistent effort, effective strategies, and a positive attitude.
280IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 13, NO. 2, APRIL 2011Optimizing Visual Search Reranking via Pairwise LearningYuan Liu and Tao Mei, Member, IEEEbe merely judged by the text-based approaches, as textual information may fail to precisely describe the visual content. For example, when users search for images with a warm color, the images cannot be easily measured by any textual description. To address this issue, visual search reranking has received increasing attention in recent years [7], [12], [18], [19], [21], [23], [27], [33], [37], [39]. It can be defined as reordering visual documents based on the initial search results or some auxiliary knowledge, aiming to improve search precision [21]. The research on visual search reranking has proceeded along three dimensions from the perspective of how external knowledge is exploited: 1) self-reranking [7], [12], [18], [23], [27], [33], [39], which mainly focuses on detecting relevant patterns (recurrent or dominant patterns) from the initial search results without any external knowledge; 2) example-reranking [21], [37], in which the query examples are provided by users so that the relevant patterns can be discovered from these examples; and 3) crowdreranking [19], which mines relevant patterns from the crowdsourcing knowledge available on the web. The first dimension, i.e., self-reranking, although relies little on the external knowledge, cannot deal with the “ambiguity problem” which is derived from the text queries. Taking the query “jaguar” as an example, the search system cannot determine what the user is really searching for, whether it is “an animal” or “a car.” As illustrated in Fig. 1(a), results with different meanings but all related to “jaguar” can be found in the top-ranked results of “jaguar.” A similar observation can be found in Fig. 
1(b), in which diverse images with the surrounding text of “apple” are mixed in the search results of “apple.” To address this problem, the second and the third dimensions leverage some auxiliary knowledge to better understand the query. Specifically, the second dimension, i.e., example-reranking, leverages a few query examples to train the reranking models. However, the typical model-based approaches usually assume the availability of a large collection of training data, which cannot be satisfied as users are reluctant to provide enough query examples while searching. To address the limitation of lack of query examples, the third dimension, i.e., crowd-reranking, leverages crowdsourcing knowledge collected from multiple search engines. It is reported that much higher improvements can be obtained since different engines can inform and complement the relevant visual information for the given query. However, it still cannot avoid the ambiguity problem as current visual search engines mainly support the text query. To summarize, on one hand, in the example-reranking, insufficient user-provided examples will limit the applications as it only works well with a large collection of training data. On the other, self-reranking and crowd-reranking still cannot deal withAbstract—Visual search reranking is defined as reordering visual documents (images or video clips) based on the initial search results or some auxiliary knowledge to improve the search precision. Conventional approaches to visual search reranking empirically take the “classification performance” as the optimization objective, in which each visual document is determined relevant or not, followed by a process of increasing the order of relevant documents. 
In this paper, we first show that the classification performance fails to produce a globally optimal ranked list, and then we formulate reranking as an optimization problem, in which a ranked list is globally optimal only if any arbitrary two documents in the list are correctly ranked in terms of relevance. This is different from existing approaches which simply classify a document as “relevant” or not. To find the optimal ranked list, we convert the individual documents to “document pairs,” each represented as a “ordinal relation.” Then, we find the optimal document pairs which can maximally preserve the initial rank order while simultaneously keeping the consistency with the auxiliary knowledge mined from query examples and web resources as much as possible. We develop two pairwise reranking methods, difference pairwise reranking (DP-reranking) and exclusion pairwise reranking (EP-reranking), to obtain the relevant relation of each document pair. Finally, a round robin criterion is explored to recover the final ranked list. We conducted comprehensive experiments on an automatic video search task over TRECVID 2005–2007 benchmarks, and showed consistent improvements over text search baseline and other reranking approaches. Index Terms—Optimization, reranking, visual search. pairwise learning, searchI. INTRODUCTION HE proliferation of digital capture devices and the explosive growth of community-contributed media contents have led to a surge of research activity in visual search [16]. Due to the great success of text document retrieval, most existing visual search systems rely entirely on the text associated with the visual documents (images or video clips), such as document title, description, automatic speech recognition (ASR) results from videos, and so on. However, visual relevance cannotManuscript received June 09, 2010; revised September 22, 2010; accepted December 17, 2010. Date of publication January 06, 2011; date of current version March 18, 2011. 
This work was performed when Y. Liu was visiting Microsoft Research Asia as a research intern. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Andrea Cavallaro. Y. Liu is with the Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China (e-mail: tc@). T. Mei is with Microsoft Research Asia, Beijing 100190, China (e-mail: tmei@). Color versions of one or more of the figures in this paper are available online at . Digital Object Identifier 10.1109/TMM.2010.2103931

TABLE I CLASSIFICATION AND RERANKING

TABLE II PERFORMANCE COMPARISON BETWEEN CLASSIFICATION AND RANKING

Fig. 1. Examples of the top-ranked results from a commercial web image search engine: (I) top 30 original web search results; (II) query examples provided by a user; and (III) filtered web examples based on the original web search results and query examples.

the ambiguity problem as current visual search engines only support the text query. To address these issues, in this paper, we leverage both query examples and crowdsourcing knowledge simultaneously. Specifically, we first feed the text query to a visual web search engine and collect the visual documents along with the associated text. To avoid the ambiguity problem, we then use the query examples to filter the web results and get cleaner “web examples.” Again taking Fig. 1 as the example, by using the two user-provided query examples, the web examples are filtered based on visual similarity [24] to better capture the user’s search intent. Using the filtered web examples, we motivate directly optimizing the ranked list in Section I-A.

A. Motivation: Classification versus Ranking

Most current approaches to visual search reranking take “classification performance” as the optimization objective: reranking is formulated as a binary classification problem that determines whether or not a visual document is relevant and then raises the rank of the relevant documents. However, although some systems have obtained a high search performance, it is known that an optimal classification performance cannot guarantee an optimal search performance [40]. Suppose we have a hypothesis space with two hypothesis functions. The two hypotheses predict a ranking for a query over a document corpus, and each document carries a relevance indicator (1: relevant, 0: irrelevant) with respect to the query. By using the example shown in Tables I and II, we can demonstrate that models which optimize for classification performance are not directly concerned with the ranking, which is often measured by average precision (AP) [34]. When we learn models that optimize for classification performance (such as “accuracy” and “recall”), the objective is to learn a “threshold” such that documents scoring higher than the threshold are classified as relevant and the documents scoring lower as irrelevant. Specifically, with the first hypothesis, a threshold between documents 8 and 9 gives two errors (documents 1–2 incorrectly classified as relevant), yielding both an accuracy and a recall of 0.80. Similarly, with the second hypothesis, a threshold between documents 3 and 4 gives three errors (i.e., documents 7–9 are incorrectly classified as relevant), yielding an accuracy and recall of 0.70 and 0.75, respectively. Therefore, a learning method which optimizes for classification performance would choose the first hypothesis, since it results in a higher accuracy and recall. However, this will lead to a suboptimal ranking as measured by the AP scores and the number of pairs correctly ranked.
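The gap between classification and ranking quality can be reproduced on a toy case. The sketch below uses two hypothetical rankings of ten documents (illustrative values, not the entries of Tables I and II) and helper names chosen here for illustration; it compares the best thresholded accuracy with AP and with the number of correctly ordered relevant/irrelevant pairs.

```python
def average_precision(labels):
    """AP of a ranked list; labels[i] is 1 if the document at rank i+1 is relevant."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / hits if hits else 0.0


def best_threshold_accuracy(labels):
    """Best accuracy over all cuts: the top k are predicted relevant, the rest irrelevant."""
    n = len(labels)
    best = 0.0
    for k in range(n + 1):
        true_pos = sum(labels[:k])
        true_neg = (n - k) - sum(labels[k:])
        best = max(best, (true_pos + true_neg) / n)
    return best


def correct_pairs(labels):
    """Count (relevant, irrelevant) pairs in which the relevant document is ranked first."""
    count, irrelevant_below = 0, 0
    for rel in reversed(labels):
        if rel:
            count += irrelevant_below
        else:
            irrelevant_below += 1
    return count


# Hypothetical relevance labels in rank order (top of the list first).
ranking_first = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
ranking_second = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]

for name, labels in (("first", ranking_first), ("second", ranking_second)):
    print(name,
          "accuracy=%.2f" % best_threshold_accuracy(labels),
          "AP=%.3f" % average_precision(labels),
          "correct pairs=%d" % correct_pairs(labels))
```

Under these assumed labels, the first ranking reaches the higher accuracy (0.90 versus 0.80) yet the lower AP (0.71 versus about 0.88) and fewer correctly ordered pairs (20 versus 21), which mirrors the argument above.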
In summary, conventional reranking approaches that regard reranking as a classification problem fail to give a globally optimal ranked list. To address this problem, we redefine reranking as guaranteeing the highest probability that each arbitrary document pair is correctly ranked in terms of relevance. Therefore, we consider the visual documents in a pairwise manner, with each pair represented by an ordinal score, and we formulate reranking as an optimization problem to find an optimal document pair set. Specifically, the objective is to minimize three types of distance, i.e., ranking distance, knowledge distance, and smooth distance. First, the optimal pair set will maximally preserve the initial ranking order; in other words, the distance between the initial pair set and the reranked pair set (ranking distance) will be minimized. For this, we give two pairwise distance measurements between two pair sets. Second, the reranked pairs should be consistent with the mined cues as much as possible; that is, the knowledge distance should be minimized simultaneously. The mined cues are represented by the concept relatedness, which is learned from the query examples and public crowdsourcing knowledge. Third, smooth distance is explored based on the assumption that if two pairs have similar characteristics (concept relatedness in this paper), then the corresponding ordinal scores should be close as well, and vice versa. To minimize the three distances simultaneously, we develop two reranking methods, i.e., exclusion pairwise reranking (EP-reranking) and difference pairwise reranking (DP-reranking), and provide the solutions.

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 13, NO. 2, APRIL 2011

Fig. 2. Flowchart of pairwise reranking.

B. Overview

The objective of pairwise learning for visual search reranking is to obtain an optimal set of document pairs by mining the query examples and crowdsourcing knowledge. The flowchart is illustrated in Fig. 2 and includes two main components, i.e., learning and reranking. Given a textual query, an initial ranked list of the visual documents (taking “images” as the example) is obtained by text-based search techniques. Meanwhile, this query is fed to a web search engine to collect a set of image search results along with their surrounding texts. In the learning process, we first use the query examples to filter the web search results and get relatively clean “web examples” based on visual similarity. We then extract the textual keywords and detect the concept confidences for web examples by means of the pre-trained concept detectors [24], as concepts can often be viewed as an effective representation, like keywords in the text domain [7], [21], [22]. By analyzing the keywords and the concept confidence scores, we can mine the concept relatedness to the given query. In the reranking process, the initial ranked list is converted to image pairs and represented by the mined concept relatedness. Then, the reranking is formulated as an optimization problem to find an optimal pair set. Finally, the reranked list is recovered from such pair sets based on a “round robin criterion.”

This paper makes the following three main contributions.
• We argue that reranking should target a higher probability that any two documents are ranked correctly in terms of relevance, rather than simply classifying each individual document as relevant or not.
• We analyze the “ambiguity problem” in visual search reranking by text query, and introduce a novel approach that leverages crowdsourcing knowledge and query examples to avoid this problem.
• We theoretically formulate visual search reranking as a global optimization problem by considering the visual documents in a pairwise manner. Furthermore, under this framework, we develop two pairwise reranking methods and give detailed solutions.

The remainder of this paper is organized as follows. Section II introduces the related work on visual search reranking. Section III presents the proposed approach. Section IV shows the experiments, followed by the conclusions in Section V.

II. RELATED WORK

There exists rich research on visual search reranking in recent years. We first give a brief survey of the related work on image and video search reranking. Additionally, query-concept mapping for visual search and learning to rank have also been attracting an increasing amount of attention. We also discuss the representative works for these two topics.

A. Visual Search Reranking

As aforementioned, the research on image and video search reranking has proceeded along three dimensions from the perspective of the external knowledge used [19]: self-reranking, which requires no external knowledge; example-reranking, which is based on user-provided query examples; and crowd-reranking, which exploits online crowdsourcing knowledge. The first dimension, i.e., self-reranking, also called unsupervised-reranking [21], aims to detect recurrent patterns (viewed as the relevant cues), such as video stories, image patches, and high-level concepts/categories, in the initial search results without any external knowledge [7], [8], [12], [18], [23], [27], [33], [39]. For example, Hsu et al. formulate the reranking process as a random walk over a context graph, where video stories are nodes and the edges between them are weighted by multimodal similarities. The video stories with similar visual appearance and text descriptions are linked compactly, and thus, the documents ranked lower can be picked up if they are strongly linked with considerable top-ranked ones [12]. Fergus et al.
first perform visual clustering on the initially returned images by probabilistic latent semantic analysis (pLSA), and then learn the visual object category based on the visual clusters; finally, the images are reranked according to the distance to the learned categories [7]. To tackle the “ambiguity problem,” the second and third dimensions explore some auxiliary knowledge to better understand the query. Specifically, the second dimension, i.e., example-reranking, also called supervised-reranking [21], leverages a few query examples to train the reranking models [21], [37] or gives some suggestions to improve the search precision [41]. For example, Yan et al. and Schroff et al. view the query examples as pseudo-positives and the bottom-ranked initial results as pseudo-negatives. A reranking model is then built based on these samples by support vector machine (SVM) [37]. Liu et al. use the query examples to learn the relevant and irrelevant concepts for a given query, and then identify an optimal set of document pairs via information theory. The final reranked list is directly recovered from this optimal pair set [21]. To leverage more examples, the third dimension, i.e., crowd-reranking, uses online crowdsourcing knowledge obtained from public social networks [19]. For example, our recent work first constructs a set of visual words based on local image patches collected from multiple image search engines, explicitly detects the so-called salient and concurrent patterns among the visual words, and then theoretically formalizes the reranking as an optimization problem on the basis of the mined visual patterns [19].
To address the above issues in example-reranking and crowd-reranking, in this paper, we leverage query examples and crowdsourcing knowledge simultaneously in an efficient way.

B. Query-Concept Mapping for Visual Search

Visual search with a set of high-level concept detectors has attracted increasing attention in recent years [14], [17], [21], [22], [28], [31], [35]. Intuitively, if queries can be automatically mapped to related concepts, search performance will benefit significantly. For example, the “face” concept can benefit people-related queries and the “sky” concept can be highly weighted for outdoor-related queries. Motivated by these observations, the problem of discovering related concepts, also called “query-concept mapping,” has recently been the focus of many researchers. For example, Kennedy et al. mine the top-ranked and bottom-ranked search results to discover related concepts by measuring the mutual information [14]. The basic idea is that if a concept has high mutual information with the top-ranked results and low mutual information with the bottom-ranked results, it will be considered a related concept. To avoid the ambiguity problem, Li et al. and Liu et al. leverage a few query examples to find related concepts; specifically, Li et al. use a tf-idf-like scheme [17] and Liu et al. explore the mutual information measurement [21]. Both methods are motivated by an information-theoretic point of view, that is, the more information about a concept the query examples bear, the more related the concept is to the corresponding query. The methods mentioned above, however, only leverage the visual information extracted from either the top-ranked results or the query examples. The other types of information, such as the text, are entirely neglected. To solve this problem, Wang et al. linearly combine the text and visual information extracted from the text query and visual examples, respectively [35]. However, most practical text queries are very short, often represented by one or two words or phrases, so it is difficult to obtain robust concept relatedness information from such a small amount of text. Besides, the ambiguity problem also cannot be avoided; for instance, the query “jaguar” may be related to both an “animal” and a “car,” but the two concepts have little relation to each other. In this paper, we first feed the text query to a public visual web search engine and collect the visual documents along with the associated text; to avoid the ambiguity problem, we use query examples to filter the web results and get cleaner “web examples.” By combining the filtered visual web examples and associated text, we aim to obtain more robust related concepts.

C. Learning to Rank

Learning to rank is now intensively studied in both the information retrieval and machine learning communities. Its goal is to automatically create a ranking model by using labeled training data and machine learning techniques. Generally, existing methods for learning to rank fall into three paradigms. For a given query and a set of retrieved documents, pointwise approaches try to directly estimate the relevance label for each query-document pair. While one may show that these approaches are consistent for a variety of performance measures [4], they ignore relative information within collections of documents. Pairwise approaches, as proposed by [1], [11], and [13], take the relative nature of the scores into account by comparing pairs of documents. They ensure that we obtain the correct order of documents even when we may not be able to obtain a good estimate of the ratings directly. Finally, listwise approaches, as proposed by [2], [9], [32], and [36], treat the ranking in its totality. In learning, they take ranked lists as instances and train a ranking function through the minimization of a listwise loss function defined on the predicted list and the ground truth list [36]. Each of those paradigms focuses on a different aspect of the dataset while largely ignoring others. Recently, Moon et al. proposed a new learning to rank algorithm, IntervalRank, which uses isotonic regression to balance the tradeoff between the three paradigms [25]. In this paper, we are motivated by the framework of pairwise learning to rank, and present a pairwise reranking method that mines the relationship between pairs of documents in the initial ranked list. Considering that there is a limited number of documents in the initial ranked list, the listwise paradigm is not considered in this paper.

III. VISUAL SEARCH RERANKING VIA PAIRWISE LEARNING

A. Problem Formulation

Suppose we have a set of documents to be reranked, indexed by their positions in the initial ranked list. We convert the initial ranked list to a pair set, in which each pair of documents records that its first document is ranked before its second in the initial ranked list. The pair set has a fixed size determined by the number of documents. Each pair carries an initial and a reranked pairwise ordinal score. These scores reflect the ordinal relation or relevance difference of the two documents, i.e., the higher the score, the higher the degree to which the first document is ranked before the second. In the reranking problem, there are three factors that should be considered: 1) the initial ranking order should be preserved, as this indicates the relevance information from a text perspective; 2) the ordinal relation of document pairs should be consistent with the learnt knowledge (i.e., concept relatedness); and 3) the ordinal relation among the document pairs should be smooth; in other words, pairs with similar concepts will have similar ordinal relations (i.e., smooth assumption). Therefore, we can formulate the reranking problem by minimizing the following energy function: (1) where the initial and reranked ordinal scores are vectors with one element per document pair, and the concept relatedness is a vector whose dimension is the size of the concept lexicon, with each element indicating the importance or relatedness of a concept to the given query. The three terms of (1) correspond to the ranking distance between the initial ranked list and the reranked list, the knowledge distance between the reranked list and the learnt concepts, and the smooth distance among the document pairs in the reranked list. Three parameters tune the contribution of the three distances. In Section IV we will demonstrate that their combination is better than using each one individually. From (1), we can find that when the three parameters are set to 0 and 1, the reranking process will rely on different knowledge. We list the details in Table III.

TABLE III DIFFERENT RERANKING PROCESSES WHEN PARAMETERS GO TO 0 AND 1

B. Discovery of Concepts Relatedness

To discover concept relatedness, we leverage the query examples and online crowdsourcing knowledge. We first feed the text query to a public web search engine to collect web search results; we then use the query examples to filter the web search results via visual similarity and get cleaner “web examples.” Given the query example set provided by the user, the final web example set is derived by (2), which uses the L1 distance and a fixed threshold. The threshold is estimated by the average of the distances between each query example pair, as defined in (3), i.e., the sum of these distances divided by the number of query example pairs. There are two methods for detecting concept relatedness to the given query, based on a pre-defined concept lexicon.
• Use a set of pre-trained concept detectors over web examples. The detectors are SVMs trained over three visual features: color moments on a 5-by-5 grid, an edge distribution histogram, and wavelet textures.
The confidence scores of the three SVM models over each visual document are then averaged to give the final concept detection confidence. The details of the features and concept detection can be found in [24], in which a set of concept detectors is built mainly based on low-level visual features and SVMs for the “high-level feature detection task.”
• Mine the surrounding text of web examples. Standard stemming and stop word removal [24] are first performed as a pre-process; then the terms with the highest frequency are selected to form a keyword set and matched against the concepts in the lexicon. Here Google Distance (GD) [3] is adopted to measure the distance between two textual words, as given in (4). GD is computed from the numbers of images containing each of the two words, the number of images containing both words, and the total number of images indexed in the Google search engine. The per-word and joint counts can be obtained by performing a search by textual words on the Google image search engine [10]. The operation applied to these counts in (4) is used to avoid extremely large values. We can see that GD is a measure of semantic interrelatedness derived from the number of hits returned by the Google search engine for a given set of keywords. Keywords with the same or similar meanings in a natural language sense tend to be “close” in units of GD, while words with dissimilar meanings tend to be far apart. If the two search terms never occur together on the same web page, but do occur separately, then the GD between them is infinite. If both terms always occur together, then their GD is zero.

By combining the above two methods, the relatedness of each concept to a given query is given by (5), which uses the confidence score of the concept for each web example obtained from the pre-trained concept detectors. A parameter tunes the contribution of concept detectors and surrounding text. Empirically, a relatively lower value of this parameter would be more suitable for a concept detector with limited performance.
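The GD computation described above can be sketched as follows. This is a minimal illustration that assumes GD takes the standard normalized Google distance form of [3], with logarithms of the hit counts; that form, the function name `google_distance`, and the counts used in the example are assumptions for illustration, not values or code from this work.

```python
import math


def google_distance(f_x, f_y, f_xy, n_total):
    """Normalized Google distance from hit counts (standard form, assumed here).

    f_x, f_y : number of images containing each of the two words
    f_xy     : number of images containing both words
    n_total  : total number of images indexed by the search engine
    """
    if f_x <= 0 or f_y <= 0:
        raise ValueError("both words must occur at least once")
    if n_total <= min(f_x, f_y):
        raise ValueError("n_total must exceed the individual hit counts")
    if f_xy == 0:
        return math.inf  # the words occur separately but never together
    log_x, log_y, log_xy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(n_total) - min(log_x, log_y))


# Made-up hit counts, used only to exercise the limiting cases.
always_together = google_distance(1000, 1000, 1000, 10**6)
never_together = google_distance(1000, 500, 0, 10**6)
often_together = google_distance(1000, 800, 400, 10**6)
rarely_together = google_distance(1000, 800, 40, 10**6)
print(always_together, never_together, often_together < rarely_together)
```

With these assumed counts the sketch reproduces the two limiting cases stated above: the distance is zero when the two words always occur together and infinite when they never do, and it shrinks as the words co-occur more often.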
More details about how to set this parameter are discussed in Section IV. Note that all elements of the concept relatedness vector should be normalized to $[0, 1]$.

C. Distance Definitions

This section discusses the three distances in (1), i.e., the ranking distance, the knowledge distance, and the smooth distance. Unlike traditional ranking distances, which are calculated between two ranking lists by using the relevance scores of the “individual documents” [19], [33], our ranking distance is calculated by using the ordinal scores of the “document pairs” after converting the ranking list to a pair set. The following two strategies of ranking distances are introduced:
• Ranking distance I: difference square (6)
• Ranking distance II: accumulated exclusion (7)
where $\oplus$ is an operator often called “XOR,” determining whether the right and the left values are consistent or not. For the ordinal score of a document pair, the associated function is defined as (8).

The knowledge distance and smooth distance are defined by means of a set of high-level concepts. First, the document pairs in the pair set are represented by a matrix with each row indicating the concept relatedness of a document pair, i.e., each entry denotes the relatedness of one pair to one concept, and is defined by the following logistic function [20]: (9) A logistic function is a common sigmoid curve. It can model the “S-shaped” curve (abbreviated S-curve) of growth of some population. The initial stage of growth is approximately exponential; then, as saturation begins, the growth slows, and at maturity, growth stops. We can see that the relatedness approaches 1 when the first argument of (9) is much higher than the second, while it approaches 0 when the first is much lower than the second. When the two are equal, the relatedness takes the midpoint value. Then, following [19], the knowledge distance is defined as (10). Taking the row of the matrix for one document pair as a vector, the quantity used in (10) can be viewed as the approximate cosine similarity between the concept-based representation of the document pair and the learned concept relatedness of the given query, since both belong to the range of $[0, 1]$. Cosine similarity is a measure of similarity between two vectors of n dimensions obtained by finding the cosine of the angle between them. Based on the assumption that if two pairs have similar concept relatedness, then the corresponding ordinal scores should be close as well, and vice versa, the smooth distance is defined as (11), which is widely used in semi-supervised learning methods [42], [43]. Here the similarity between two pairs is computed from the vector difference of their concept relatedness by using the Gaussian kernel (12), where the scaling parameter is often set to the median value of all distances between each two pairs.

D. Pair Optimization

Based on the distances defined in Section III-C, we integrate them into (1) and have the following two kinds of objective reranking functions in matrix form:
• Reranking function using ranking distance I: (13)
We call this optimization problem difference pairwise reranking (DP-reranking). We can obtain the solution of (13) as (14) (proven in the Appendix). In (14), the identity matrix has diagonal elements equal to 1 and all other elements equal to 0, and a diagonal matrix also appears. Two of the vectors are obtained by replacing the last element of their original counterparts with zero, and one matrix is obtained by replacing the last row of its original counterpart. We can see that (14) consists of two parts, which correspond to the initial search results and the learnt concept relatedness, respectively, and both are smoothed by each other. Therefore, the reranked list can be viewed as the combination of the initial search results and the learnt external knowledge. The initial ordinal score is defined for each pair, which is made up of two documents. It can be given by the normalized order difference between the two documents (15). We can see that the initial ordinal score is positive, as the first document is ranked before the second in the pair set as defined in Section III-A.
• Reranking function using ranking distance II: see (16) at the bottom of the next page.
We call this optimization problem exclusion pairwise reranking (EP-reranking). The