20 News Groups Dataset

Objections: This dataset is too well known and is in fact used as the example dataset in the Rainbow software documentation.
Data summary:
This is a well-known data set for text classification, used mainly for training classifiers with both labeled and unlabeled data (see references below). The data set is a collection of 20,000 messages collected from UseNet postings over a period of several months in 1993. The data are divided almost evenly among 20 different UseNet discussion groups. Many of the categories fall into overlapping topics; for example, 5 of them are computer discussion groups and 3 of them discuss religion. Other topics covered by the newsgroups are politics, sports, science, and miscellaneous.
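The topic overlap described above can be seen by grouping the newsgroup names by their top-level UseNet hierarchy. The following is a minimal sketch, not part of the original description: the 20 group names are the ones commonly distributed with the dataset and are listed here as an assumption, and `group_by_hierarchy` is a hypothetical helper name chosen for illustration.

```python
from collections import defaultdict

# Group names as commonly distributed with the dataset.
# Assumption: the description above does not list them explicitly.
NEWSGROUPS = [
    "alt.atheism",
    "comp.graphics",
    "comp.os.ms-windows.misc",
    "comp.sys.ibm.pc.hardware",
    "comp.sys.mac.hardware",
    "comp.windows.x",
    "misc.forsale",
    "rec.autos",
    "rec.motorcycles",
    "rec.sport.baseball",
    "rec.sport.hockey",
    "sci.crypt",
    "sci.electronics",
    "sci.med",
    "sci.space",
    "soc.religion.christian",
    "talk.politics.guns",
    "talk.politics.mideast",
    "talk.politics.misc",
    "talk.religion.misc",
]


def group_by_hierarchy(names):
    """Map each top-level hierarchy (text before the first dot) to its groups."""
    groups = defaultdict(list)
    for name in names:
        top_level = name.split(".", 1)[0]
        groups[top_level].append(name)
    return dict(groups)


by_hierarchy = group_by_hierarchy(NEWSGROUPS)

# Religion-related groups are spread over three different hierarchies,
# so they are selected by keyword rather than by prefix.
religion = [n for n in NEWSGROUPS if "religion" in n or "atheism" in n]

if __name__ == "__main__":
    for top_level, members in sorted(by_hierarchy.items()):
        print(top_level, len(members))
    print("religion-related:", religion)
```

Grouping by prefix picks out the computer-related groups directly, while the religion-related groups have to be found by keyword because they sit in different hierarchies.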
Keywords:
Data mining, News, Text classification, Overlapping topics
Data format:
TEXT
Data usage:
The data can be used for text classification.
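As an illustration of this use, here is a minimal sketch of a multinomial naive Bayes text classifier written with the Python standard library only. It is one common baseline for this kind of data, not a method prescribed by the dataset. The four training sentences are invented toy examples rather than messages from the dataset, and the function names `tokenize`, `train_naive_bayes`, and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Collect per-class document counts and word counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": class_docs,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(docs),
    }


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class_docs in model["class_docs"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n_class_docs / model["n_docs"])  # log prior
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[token] + 1) / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data: invented sentences, NOT taken from the dataset.
TRAIN_DOCS = [
    "the shuttle launch reached orbit",
    "nasa plans a new orbit mission",
    "the goalie stopped the puck",
    "the team won the hockey playoff",
]
TRAIN_LABELS = ["sci.space", "sci.space", "rec.sport.hockey", "rec.sport.hockey"]

model = train_naive_bayes(TRAIN_DOCS, TRAIN_LABELS)

if __name__ == "__main__":
    print(predict(model, "orbit mission launch"))
    print(predict(model, "hockey puck"))
```

With the real data, each message would serve as one document and its newsgroup name as the label. A real experiment would also need a held-out test split and better tokenization than whitespace splitting.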
Detailed description: