User Portrait Research Based on Text Mining
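Name: Gao Yulong (高玉龙)
Student ID: 11109051
College: College of Engineering
Supervisor: Prof. Sun Haojun (孙浩军)
Major: Computer Application Technology
Enrollment: 2011/09/10
Defense: 2014/05/30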
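Declaration of Originality
This thesis presents the research work I carried out, and the results I obtained, under the guidance of my supervisor. Except where marked and acknowledged, it contains no research results published or written by other institutions or authors. Every group or individual who contributed to or assisted with this research is clearly identified in the thesis. I am fully aware that I bear the legal responsibility for this declaration.
Author's signature:        Date (year/month/day):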
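Authorization for Use of the Thesis
I authorize Shantou University (汕头大学) to retain electronic and paper copies of this thesis and to allow it to be borrowed and consulted. The university may include all or part of this thesis in relevant databases for retrieval, and may preserve and compile it by photocopying, reduced-size printing, or other means of reproduction. The university may submit the thesis to relevant national departments or institutions and authorize them to retain it, lend it, or publish all or part of it online. Confidential theses are handled according to the applicable confidentiality regulations and procedures.
This thesis is: confidential ( ); this authorization applies after declassification in    year(s).
Not confidential ( ). (Please mark the applicable parentheses above with "√".)
Author's signature:        Supervisor's signature:
Date (year/month/day):        Date: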
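Chinese Abstract
With the rapid development and growing popularity of the Internet, the value of online marketing has gradually gained attention and recognition. E-commerce has grown rapidly as a result, and e-commerce websites have become a "necessity" in most people's lives. More and more merchants also hope to use e-commerce to hold their ground in the fierce competition of the online retail market. As e-commerce develops, user behavior research has become an important factor in a site's survival and growth, and the quality of that research largely determines whether consumers stay or leave. As the number of Internet users and the size of the e-commerce market keep expanding, competition grows ever fiercer. Securing competitive advantage, strengthening the capacity for strategically differentiated development, and ensuring sustainable development are goals shared by all e-commerce enterprises. To optimize their on-site marketing activities and operating costs, these enterprises invest substantial resources in analyzing website user behavior. As the industry develops, they need more advanced means of analyzing user behavior and of building their own user portraits, and most e-commerce websites have already accumulated enough user consumption data to support user segmentation and related analyses.
Accordingly, this thesis collects and studies data from mainstream e-commerce websites in China and proposes a strategy for constructing user portraits, in which user attributes are divided into basic attribute labels, behavioral attribute labels, value attribute labels, and social attribute labels. A method based on probability and information entropy is used to segment the user data into words. The analytic hierarchy process (AHP) is used to analyze users' value attributes and obtain their value attribute labels. User portraits are then built from a set of defined rules, and k-means is used to cluster the resulting portraits.
Keywords: e-commerce; user portrait; user research; user clustering analysis; k-means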
Abstract
With the rapid development of the Internet, the value of online marketing has gradually received attention and recognition. E-commerce has therefore developed rapidly, and e-commerce websites have become a necessity in most people's lives. More and more businesses also hope to remain competitive in the fierce online sales market by means of e-commerce. With the continuous development of e-commerce, user behavior research has become increasingly important to its growth and a key factor in retaining consumers. As the number of Internet users and the e-commerce market continue to expand, competition in the market is increasingly intense; grasping competitive advantage, strengthening the capacity for strategic development, and guaranteeing sustainable development are the common goals of all e-commerce enterprises. E-commerce enterprises have to invest substantial resources in website user behavior analysis in order to optimize website marketing and operating costs. With the rapid development of the industry, they need more advanced methods for website user behavior analysis and for constructing their own user portraits, and most e-commerce websites have accumulated enough consumer behavior information to perform user segmentation.
Therefore, this thesis studies data from mainstream e-commerce websites in China and puts forward a research strategy for constructing user portraits, dividing user attributes into basic attribute labels, behavior attribute labels, value attribute labels, and social attribute labels. A method based on probability and information entropy is used to segment users' data into words, AHP is used to analyze users' value attributes and obtain their value attribute labels, relevant rules are defined to build the user portraits, and k-means is used to cluster the constructed user portraits.
Keywords: e-commerce; user portrait; user research; user clustering analysis; k-means
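As a rough illustration of two of the techniques named in the abstract, the sketch below derives AHP priority weights from a pairwise comparison matrix and clusters user vectors with k-means. It is a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not give the comparison matrix, the value criteria, or the feature vectors, so the three criteria (recency, frequency, spend), the matrix entries, the sample users, and the function names `ahp_weights` and `kmeans` are all hypothetical. The weights use the row geometric-mean approximation rather than the principal eigenvector.

```python
import math
import random


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method.

    `matrix` is a reciprocal pairwise comparison matrix (list of rows).
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd-style k-means; returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignments, centroids


if __name__ == "__main__":
    # Hypothetical judgments: spend > frequency > recency in importance.
    comparison = [
        [1, 1 / 2, 1 / 3],
        [2, 1, 1 / 2],
        [3, 2, 1],
    ]
    weights = ahp_weights(comparison)

    # Hypothetical users scored on (recency, frequency, spend), each in [0, 1].
    users = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.9), (0.2, 0.1, 0.1), (0.1, 0.2, 0.3)]
    # Scale each feature by its AHP weight before clustering.
    weighted = [tuple(w * x for w, x in zip(weights, u)) for u in users]
    labels, _ = kmeans(weighted, k=2)
    print("weights:", weights)
    print("cluster labels:", labels)
```

Scaling the features by the AHP weights before clustering is one possible way to connect the two steps; the abstract only says that value labels come from AHP and that the finished portraits are clustered, so the actual coupling in the thesis may differ.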