Machine Learning: Trains Data Set


Trains Data Set

Data summary:

Two data formats (structured and one-instance-per-line)

Keywords:

Multivariate, Classification, UCI, Trains

Data format:

TEXT

Data usage:

This data set is used for classification.

Detailed description:

Trains Data Set Abstract: The data set is provided in two formats (structured and one-instance-per-line).

Source:

Original owners:

Ryszard S. Michalski (michalski '@' ) and Robert Stepp

Donor:

GMU, Center for AI, Software Librarian, Eric E. Bloedorn (bloedorn '@' )

Data Set Information:

Notes:

- Additional "background" knowledge is supplied that provides a partial ordering on some of the attribute values.

- We are providing this dataset both in its original form and in a form similar to the more typical propositional datasets in our repository. Since the trains dataset records relations between attributes, this transformation was somewhat challenging. However, it may shed some light on this problem for people who are more familiar with the simple one-instance-per-line dataset format.

Hierarchy of values:

if cshape is one of {openrect, opentrap, ushaped, dblopnrect}

then cshape is opentop

if cshape is one of {hexagon, ellipse, closedrect, jaggedtop, slopetop, engine}

then cshape is closedtop
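
The value hierarchy above can be sketched as a small lookup. This is only an illustration: the function name generalize_cshape and the set names are hypothetical and are not part of the dataset files.

```python
# Car shapes grouped by the more general value they map to,
# copied from the "Hierarchy of values" rules.
OPEN_TOP = {"openrect", "opentrap", "ushaped", "dblopnrect"}
CLOSED_TOP = {"hexagon", "ellipse", "closedrect", "jaggedtop", "slopetop", "engine"}


def generalize_cshape(cshape: str) -> str:
    """Return the more general shape value (hypothetical helper)."""
    if cshape in OPEN_TOP:
        return "opentop"
    if cshape in CLOSED_TOP:
        return "closedtop"
    raise ValueError(f"unknown car shape: {cshape!r}")
```

A rule learner could use such a mapping to test a condition such as "the car is closedtop" instead of listing every specific shape.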

Prediction task: Determine concise decision rules distinguishing trains traveling east from those traveling west.

Attribute Information:

The following format was used for the "transformed" dataset representation as found in trains.transformed.data (one instance per line):

1. Number_of_cars (integer in [3-5])

2. Number_of_different_loads (integer in [1-4])

3-22: 5 attributes for each of cars 2 through 5: (20 attributes total)

- num_wheels (integer in [2-3])

- length (short or long)

- shape (closedrect, dblopnrect, ellipse, engine, hexagon, jaggedtop, openrect, opentrap, slopetop, ushaped)

- num_loads (integer in [0-3])

- load_shape (circlelod, hexagonlod, rectanglod, trianglod)

23-32: 10 Boolean attributes describing whether 2 types of loads are on adjacent cars of the train

- Rectangle_next_to_rectangle (0 if false, 1 if true)

- Rectangle_next_to_triangle (0 if false, 1 if true)

- Rectangle_next_to_hexagon (0 if false, 1 if true)

- Rectangle_next_to_circle (0 if false, 1 if true)

- Triangle_next_to_triangle (0 if false, 1 if true)

- Triangle_next_to_hexagon (0 if false, 1 if true)

- Triangle_next_to_circle (0 if false, 1 if true)

- Hexagon_next_to_hexagon (0 if false, 1 if true)

- Hexagon_next_to_circle (0 if false, 1 if true)

- Circle_next_to_circle (0 if false, 1 if true)

33. Class attribute (east or west)
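
As a rough sketch of how attributes 23-32 might be derived, the code below computes the ten flags from the load shapes of consecutive cars. It assumes that adjacency is symmetric and that cars without a recognised load shape are skipped; neither assumption is confirmed by the description above, and the function name adjacency_flags is hypothetical.

```python
from itertools import combinations_with_replacement

# Load kinds in the order used by attributes 23-32.
KINDS = ["Rectangle", "Triangle", "Hexagon", "Circle"]
# Maps the load_shape values of the per-car attributes to a load kind.
LOAD_KIND = {
    "rectanglod": "Rectangle",
    "trianglod": "Triangle",
    "hexagonlod": "Hexagon",
    "circlelod": "Circle",
}
# Ten names, Rectangle_next_to_rectangle ... Circle_next_to_circle.
PAIR_NAMES = [
    f"{a}_next_to_{b.lower()}"
    for a, b in combinations_with_replacement(KINDS, 2)
]


def adjacency_flags(load_shapes):
    """Return {attribute name: 0 or 1} for load shapes given in train order."""
    rank = {kind: i for i, kind in enumerate(KINDS)}
    seen = set()
    for left, right in zip(load_shapes, load_shapes[1:]):
        if left in LOAD_KIND and right in LOAD_KIND:
            # Order the pair so it matches the attribute naming scheme.
            a, b = sorted((LOAD_KIND[left], LOAD_KIND[right]), key=rank.get)
            seen.add(f"{a}_next_to_{b.lower()}")
    return {name: int(name in seen) for name in PAIR_NAMES}
```

For example, adjacency_flags(["circlelod", "rectanglod", "trianglod"]) sets only Rectangle_next_to_circle and Rectangle_next_to_triangle to 1.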

The number of cars varies between 3 and 5. Therefore, attributes referring to properties of cars that do not exist (such as the 5 attributes for the "5th" car when the train has fewer than 5 cars) are assigned a value of "-".
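
A minimal parsing sketch for one line of the transformed representation follows. It assumes the 33 fields are separated by whitespace, which the description above does not state, and it maps "-" to None. The function name parse_transformed_line and the sample line are made up for illustration; the sample is not copied from the dataset.

```python
CAR_FIELDS = ["num_wheels", "length", "shape", "num_loads", "load_shape"]
INT_FIELDS = {"num_wheels", "num_loads"}
MISSING = "-"  # value assigned to attributes of cars that do not exist


def parse_transformed_line(line):
    """Parse one instance into a dict (assumes whitespace-separated fields)."""
    tokens = line.split()
    if len(tokens) != 33:
        raise ValueError(f"expected 33 fields, got {len(tokens)}")
    cars = {}
    for i, car_no in enumerate(range(2, 6)):  # cars 2 through 5
        chunk = tokens[2 + 5 * i : 7 + 5 * i]  # attributes 3-22, five per car
        car = {}
        for name, tok in zip(CAR_FIELDS, chunk):
            if tok == MISSING:
                car[name] = None
            elif name in INT_FIELDS:
                car[name] = int(tok)
            else:
                car[name] = tok
        cars[car_no] = car
    return {
        "number_of_cars": int(tokens[0]),
        "number_of_different_loads": int(tokens[1]),
        "cars": cars,
        "adjacency": [int(t) for t in tokens[22:32]],  # attributes 23-32
        "class": tokens[32],  # attribute 33: east or west
    }


# Hypothetical instance for a 3-car train (cars 4 and 5 do not exist).
SAMPLE = (
    "3 2 2 short closedrect 1 circlelod 2 long openrect 1 rectanglod "
    "- - - - - - - - - - 0 0 0 1 0 0 0 0 0 0 east"
)
record = parse_transformed_line(SAMPLE)
```

Keeping the missing cars as None values, rather than dropping them, preserves the fixed 33-column layout, so each position still maps to the same attribute number.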

Relevant Papers:

R.S. Michalski and J.B. Larson, "Inductive Inference of VL Decision Rules," in Proceedings of the Workshop in Pattern-Directed Inference Systems, Hawaii, May 1977.


R.E. Stepp and R.S. Michalski, "Conceptual Clustering: Inventing Goal-Oriented Classifications of Structured Objects," in R.S. Michalski, J.G. Carbonell, and T.M. Mitchell (Eds.), Machine Learning: An Artificial Intelligence Approach, Volume II. Los Altos, CA: Morgan Kaufmann.

