Machine Learning: Artificial Characters Data Set (Handwritten Characters)
Artificial Characters Data Set (Handwritten Characters)
Data summary:
Dataset artificially generated using a first-order theory that describes the structure of ten capital letters of the English alphabet.
Chinese keywords:
Machine learning, handwritten characters, multivariate, classification, UCI
English keywords:
Machine Learning, Artificial Characters, Multivariate, Classification, UCI
Data format:
TEXT
Data use:
Classification
Detailed description:
Artificial Characters Data Set
Abstract: Dataset artificially generated using a first-order theory that describes the structure of ten capital letters of the English alphabet.
Source:
Original Owners of Database:
1. H. Altay Guvenir, PhD.,
Bilkent University,
Department of Computer Engineering and Information Science,
06533 Ankara, Turkey
Phone: +90 (312) 266 4133
Email: guvenir '@' .tr
2. Burak Acar, M.S.,
Bilkent University,
EE Eng. Dept.
06533 Ankara, Turkey
Email: buraka '@' .tr
3. Haldun Muderrisoglu, M.D., Ph.D.,
Baskent University,
School of Medicine
Ankara, Turkey
Donor:
H. Altay Guvenir
Bilkent University,
Department of Computer Engineering and Information Science,
06533 Ankara, Turkey
Phone: +90 (312) 266 4133
Email: guvenir '@' .tr
Data Set Information:
This database has been artificially generated using a first-order theory that describes the structure of ten capital letters of the English alphabet, together with a random-choice theorem prover that accounts for heterogeneity in the instances. The capital letters represented are A, C, D, E, F, G, H, L, P, and R. Each instance is structured and is described by a set of segments (lines) that resemble the way an automatic program would segment an image. Each instance is stored in a separate file with the following format:
CLASS OBJNUM TYPE XX1 YY1 XX2 YY2 SIZE DIAG
where CLASS is an integer number indicating the class as described below, OBJNUM is an integer identifier of a segment (starting from 0) in the instance and the remaining columns represent attribute values. For further details, contact the author.
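A minimal sketch of how one instance file in this layout could be read, assuming the nine columns are separated by whitespace and that the file has no header row (neither is stated in the description). The `Segment` class, the `parse_instance` and `segment_length` helpers, and the two sample rows are hypothetical and invented for illustration; they are not taken from the dataset. The column types follow the C types given under Attribute Information, and `segment_length` assumes SIZE is the Euclidean distance between the two endpoints.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    cls: int      # CLASS: integer class label
    objnum: int   # OBJNUM: segment identifier within the instance, starting at 0
    type: str     # TYPE: always the string "line"
    x1: int       # XX1, YY1: initial coordinates of the segment
    y1: int
    x2: int       # XX2, YY2: final coordinates of the segment
    y2: int
    size: float   # SIZE: length of the segment
    diag: float   # DIAG: diagonal of the character's bounding rectangle


def parse_instance(text):
    """Parse the rows of one instance file (whitespace-separated: an assumption)."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 9:
            raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")
        cls, objnum, seg_type = int(fields[0]), int(fields[1]), fields[2]
        x1, y1, x2, y2 = (int(v) for v in fields[3:7])
        size, diag = float(fields[7]), float(fields[8])
        segments.append(Segment(cls, objnum, seg_type, x1, y1, x2, y2, size, diag))
    return segments


def segment_length(seg):
    """Recompute the segment length as a Euclidean distance (assumed meaning of SIZE)."""
    return math.hypot(seg.x2 - seg.x1, seg.y2 - seg.y1)


# Invented sample rows in the CLASS OBJNUM TYPE XX1 YY1 XX2 YY2 SIZE DIAG layout.
SAMPLE = """\
1 0 line 0 0 0 12 12.0 13.0
1 1 line 0 0 5 0 5.0 13.0
"""

segments = parse_instance(SAMPLE)
for seg in segments:
    print(seg.objnum, seg.type, seg.size, segment_length(seg))
```

Comparing the stored SIZE with the recomputed length is one way to sanity-check a parsed file; any mismatch beyond rounding would suggest that the delimiter or column-order assumption is wrong.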
Attribute Information:
TYPE: the first attribute describes the type of segment and is always set to the string "line". Its C language type is char.
XX1, YY1, XX2, YY2: these attributes contain the initial and final coordinates of a segment in a Cartesian plane. Their C language type is int.
SIZE: this is the length of a segment, computed as the geometric distance between its two endpoints A(XX1, YY1) and B(XX2, YY2). Its C language type is float.
DIAG: this is the length of the diagonal of the smallest rectangle which includes the picture of the character. The value of this attribute is the same in each object. Its C language type is float.
Relevant Papers:
M. Botta, A. Giordana, L. Saitta: "Learning Fuzzy Concept Definitions", IEEE-Fuzzy Conference, 1993.
M. Botta, A. Giordana: "Learning Quantitative Features in a Symbolic Environment", LNAI 542, 1991, pp. 296-305.