Machine Learning: Artificial Characters Data Set (Handwritten Characters)

Artificial Characters Data Set (Handwritten Characters)

Data summary:

Dataset artificially generated by using a first-order theory which describes the structure of ten capital letters of the English alphabet.

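Keywords (translated from Chinese):

Machine learning, handwritten characters, multivariate, classification, UCI

Keywords (English):

Machine Learning, Artificial Characters, Multivariate, Classification, UCI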

Data format:

TEXT

Data use:

Classification

Detailed description:

Artificial Characters Data Set

Abstract: Dataset artificially generated by using a first-order theory which describes the structure of ten capital letters of the English alphabet.

Source:

Original Owners of Database:

1. H. Altay Guvenir, PhD.,

Bilkent University,

Department of Computer Engineering and Information Science,

06533 Ankara, Turkey

Phone: +90 (312) 266 4133

Email: guvenir '@' .tr

2. Burak Acar, M.S.,

Bilkent University,

EE Eng. Dept.

06533 Ankara, Turkey

Email: buraka '@' .tr

3. Haldun Muderrisoglu, M.D., Ph.D.,

Baskent University,

School of Medicine

Ankara, Turkey

Donor:

H. Altay Guvenir

Bilkent University,

Department of Computer Engineering and Information Science,

06533 Ankara, Turkey

Phone: +90 (312) 266 4133

Email: guvenir '@' .tr

Data Set Information:

This database has been artificially generated by using a first-order theory which describes the structure of ten capital letters of the English alphabet and a random choice theorem prover which accounts for heterogeneity in the instances. The capital letters represented are the following: A, C, D, E, F, G, H, L, P, R. Each instance is structured and is described by a set of segments (lines) which resemble the way an automatic program would segment an image. Each instance is stored in a separate file whose format is the following:

CLASS OBJNUM TYPE XX1 YY1 XX2 YY2 SIZE DIAG

where CLASS is an integer number indicating the class as described below, OBJNUM is an integer identifier of a segment (starting from 0) in the instance and the remaining columns represent attribute values. For further details, contact the author.
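As a rough illustration of the record layout above, the following Python sketch parses one record into typed fields. It assumes the columns are separated by whitespace, which the description does not state. The `Segment` class and the `parse_segment` function are hypothetical names introduced here, and the sample record is made up, not taken from the data set.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    cls: int      # CLASS: integer class label
    objnum: int   # OBJNUM: segment identifier, starting from 0
    type: str     # TYPE: segment type
    x1: int       # XX1
    y1: int       # YY1
    x2: int       # XX2
    y2: int       # YY2
    size: float   # SIZE
    diag: float   # DIAG


def parse_segment(line: str) -> Segment:
    """Parse one record, assuming whitespace-separated columns."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    c, n, t, x1, y1, x2, y2, size, diag = fields
    return Segment(int(c), int(n), t, int(x1), int(y1),
                   int(x2), int(y2), float(size), float(diag))


# Made-up record for illustration only.
example = parse_segment("1 0 line 0 0 3 4 5.0 10.0")
print(example)
```

If the actual files use a different delimiter, only the `line.split()` call would need to change.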

Attribute Information:

TYPE: the first attribute describes the type of segment and is always set to the string "line". Its C language type is char.

XX1,YY1,XX2,YY2: these attributes contain the initial and final coordinates of a segment in a cartesian plane. Their C language type is int.

SIZE: this is the length of a segment computed by using the geometric distance between its two endpoints A(XX1,YY1) and B(XX2,YY2). Its C language type is float.
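The SIZE definition can be checked with a few lines of Python. This is only a sketch: it assumes that "geometric distance" means the Euclidean distance, `segment_length` is a hypothetical helper name, and the coordinates are made up.

```python
import math


def segment_length(x1: int, y1: int, x2: int, y2: int) -> float:
    """Euclidean distance between the endpoints (x1, y1) and (x2, y2)."""
    return math.hypot(x2 - x1, y2 - y1)


# A segment from (0, 0) to (3, 4) is the hypotenuse of a 3-4-5 triangle.
print(segment_length(0, 0, 3, 4))  # 5.0
```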

DIAG: this is the length of the diagonal of the smallest rectangle which includes the picture of the character. The value of this attribute is the same for every object (segment) of an instance. Its C language type is float.

Relevant Papers:

M. Botta, A. Giordana, L. Saitta: "Learning Fuzzy Concept Definitions", IEEE-Fuzzy Conference, 1993.

M. Botta, A. Giordana: "Learning Quantitative Feature in a Symbolic Environment", LNAI 542, 1991, pp. 296-305.
