Machine Learning Chapter 2 Homework Answers

2.5 (c) (continued): Since 2^8 = 256, the shortest query sequence has length 8. We choose each training sample so that it flips exactly one attribute of the known positive example; whichever way the trainer classifies it, the hypothesis space is halved. After the 8 queries, the learner converges to a single correct hypothesis. The eight queries are:
<female, black, short, Portuguese>, <female, blonde, tall, Indian>
<male, brown, short, Portuguese>, <female, blonde, tall, Indian>
<male, black, tall, Portuguese>, <female, blonde, tall, Indian>
<male, black, short, US>, <female, blonde, tall, Indian>
<male, black, short, Portuguese>, <male, blonde, tall, Indian>
<male, black, short, Portuguese>, <female, black, tall, Indian>
<male, black, short, Portuguese>, <female, blonde, short, Indian>
<male, black, short, Portuguese>, <female, blonde, tall, US>

(d) To express every possible concept in the language, the hypothesis space must be expanded to contain every subset of the 256 instances, giving 2^256 hypotheses, far greater than 256. With this space the learner cannot converge: for every unseen instance, exactly half of the version space votes positive and half votes negative, so no training sample changes the vote on the remaining instances and the learner cannot classify anything beyond the observed samples. Hence there is no optimal query sequence.

2.6 Proof: Every member of VS_{H,D} satisfies the right-hand side of the expression. Let h be an arbitrary member of VS_{H,D}; then h is consistent with all training examples in D. Assume h does not satisfy the right-hand side, i.e. it is not the case that (∃s ∈ S)(∃g ∈ G)(g ≥g h ≥g s); equivalently, (∀s ∈ S)(∀g ∈ G) ¬((g ≥g h) ∧ (h ≥g s)). But since h is consistent with D, there must exist some maximally specific consistent hypothesis s ∈ S with h ≥g s, and some maximally general consistent hypothesis g ∈ G with g ≥g h, a contradiction. Hence every h in VS_{H,D} satisfies the right-hand side.
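The halving argument in 2.5 (c) can be checked numerically. A minimal sketch under an assumed encoding (mine, not the original answer's): each of the 256 hypotheses consistent with the one positive example is a tuple of 8 booleans recording which attribute slots are generalized to "?", and the query that flips attribute i is accepted by a hypothesis iff slot i is "?".

```python
from itertools import product

# The 256 hypotheses consistent with one positive example: each of the
# 8 attribute slots is either kept fixed or generalized (True = "?").
hypotheses = list(product([True, False], repeat=8))

def classifies_positive(h, flipped_attr):
    # A query that flips exactly one attribute of the positive example
    # is accepted by a hypothesis iff that slot is generalized to "?".
    return h[flipped_attr]

remaining = hypotheses
for attr in range(8):
    pos = [h for h in remaining if classifies_positive(h, attr)]
    neg = [h for h in remaining if not classifies_positive(h, attr)]
    assert len(pos) == len(neg)   # each query splits the space exactly in half
    remaining = pos               # suppose the trainer answers "yes"

print(len(hypotheses), "->", len(remaining))  # 256 hypotheses, 1 left after 8 queries
```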
The only time hypotheses are added to G is when a negative example is presented, as stated in the algorithm. Therefore, a general strategy for minimizing G is to present all positive training examples before the negative ones. Following this strategy we arrive at a sum of 11, which is the minimum for this set of training examples, since in the final state G contains 2 hypotheses, one more than the absolute minimum calculated above.

2.4 Ans:
(a) The S boundary is S = (4, 6, 3, 5), i.e. 4 ≤ x ≤ 6 and 3 ≤ y ≤ 5, as shown in graph 1 below.
(b) The G boundary is G = (3, 8, 2, 7), i.e. 3 ≤ x ≤ 8 and 2 ≤ y ≤ 7, as shown in graph 1 below.
(c) A query guaranteed to reduce the size of the version space, regardless of how the trainer classifies it: e.g. (7, 6). A query that will not: e.g. (5, 4).
(d) The smallest number of training examples is 4. The four points are: (3, 2, +), (5, 9, +), (2, 1, -), (6, 10, -).
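The choice of queries in 2.4 (c) can be checked with a small sketch. The rectangle encoding (x_min, x_max, y_min, y_max) and the helper name `inside` are my own illustration, not part of the original answer; the point is that a query is guaranteed to shrink the version space exactly when the hypotheses disagree on it, i.e. when it lies inside G but outside S.

```python
def inside(rect, point):
    # rect = (x_min, x_max, y_min, y_max); a rectangle hypothesis
    # classifies a point as positive iff the point lies inside it.
    x_min, x_max, y_min, y_max = rect
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max

S = (4, 6, 3, 5)  # most specific boundary, from (a)
G = (3, 8, 2, 7)  # most general boundary, from (b)

for p in [(7, 6), (5, 4)]:
    # Guaranteed reduction iff some hypotheses accept p and some reject it.
    guaranteed = inside(G, p) and not inside(S, p)
    print(p, "guaranteed to reduce the version space:", guaranteed)
```

Here (7, 6) falls between the S and G rectangles, so either classification eliminates part of the version space, while (5, 4) lies inside S, so a positive classification would eliminate nothing.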
Machine Learning HW2
14S051053-汪洋
2.1 Ans: Since any occurrence of "Ø" for an attribute yields a hypothesis that accepts no instance, all such hypotheses are semantically equal to the single all-Ø hypothesis. So the number of semantically distinct hypotheses is 4*3*3*3*3*3 + 1 = 973. With the additional attribute Watercurrent (3 possible values), the number of instances is 3*2*2*2*2*2*3 = 288 and the number of hypotheses is 4*3*3*3*3*3*4 + 1 = 3889. Generally, if Watercurrent can take k values, the number of hypotheses is 4*3*3*3*3*3*(k+1) + 1.

2.2 Ans:
Start:
S0 = {<Ø, Ø, Ø, Ø, Ø, Ø>}
G0 = {<?, ?, ?, ?, ?, ?>}

Add example 4: <Sunny, Warm, High, Strong, Cool, Change, Yes>
S1 = {<Sunny, Warm, High, Strong, Cool, Change>}
G1 = {<?, ?, ?, ?, ?, ?>}

Add example 3: <Rainy, Cold, High, Strong, Warm, Change, No>
S2 = {<Sunny, Warm, High, Strong, Cool, Change>}
G2 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, Cool, ?>}

Add example 2: <Sunny, Warm, High, Strong, Warm, Same, Yes>
S3 = {<Sunny, Warm, High, Strong, ?, ?>}
G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>} (the member <?, ?, ?, ?, Cool, ?> is removed, since it does not cover this positive example)

Add example 1: <Sunny, Warm, Normal, Strong, Warm, Same, Yes>
S4 = {<Sunny, Warm, ?, Strong, ?, ?>}
G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

The final version space is the same because it contains all the hypotheses consistent with the set of training examples used. Since there is only one set of hypotheses consistent with any given set of training examples, the algorithm must arrive at that set regardless of the order of presentation.

The sum of the sizes of S and G for the order of training examples above is 14, counting the initial state. The minimum this sum can be for any set of training examples is 2 * (#examples + 1), where #examples stands for the number of examples; for our example-set size, the minimum would be 10.

Since the hypothesis representation in use can only generalize an attribute with ?, there can be only one hypothesis in S at any one time. Hypotheses never get added to S, since the specific hypothesis space relies on conjunctive expressions.
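The counts in 2.1 can be verified mechanically. A minimal sketch (the function name `num_semantically_distinct` is mine, not from the original answer): each attribute with k values contributes k+1 choices (a concrete value or ?), plus 1 for the single reject-all hypothesis.

```python
from math import prod

def num_semantically_distinct(value_counts):
    # Each attribute contributes (k + 1) choices: one of its k values or "?".
    # Add 1 for the single reject-all hypothesis (any attribute set to Ø).
    return prod(k + 1 for k in value_counts) + 1

print(num_semantically_distinct([3, 2, 2, 2, 2, 2]))     # 973, as in 2.1
print(num_semantically_distinct([3, 2, 2, 2, 2, 2, 3]))  # 3889, with Watercurrent
print(prod([3, 2, 2, 2, 2, 2, 3]))                       # 288 instances
```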
In other words, generalizing S cannot be done by using an "or" condition. Consider an attribute A with possible values b, c, and d. If b and c were consistent with all positive training examples but d was not, the hypothesis in S must represent A as ? instead of there being two hypotheses, one specifying b and the other c. This is the natural bias of the representation, which allows classification of instances not previously encountered. This leads to considering how to minimize G.
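The attribute-wise generalization of S described above (replace any attribute that disagrees with a positive example by ?) can be sketched as follows. The helper name is hypothetical, but the step reproduces the S2 → S3 transition in the 2.2 trace.

```python
def minimally_generalize(h, example):
    # Replace every attribute of h that disagrees with the positive
    # example by "?"; a "Ø" (reject-all) slot is overwritten by the
    # example's value. This is the only generalization the conjunctive
    # language allows -- it cannot express "b or c".
    out = []
    for hv, ev in zip(h, example):
        if hv == "Ø":
            out.append(ev)
        elif hv == ev:
            out.append(hv)
        else:
            out.append("?")
    return tuple(out)

s2 = ("Sunny", "Warm", "High", "Strong", "Cool", "Change")
print(minimally_generalize(s2, ("Sunny", "Warm", "High", "Strong", "Warm", "Same")))
# ('Sunny', 'Warm', 'High', 'Strong', '?', '?')
```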
Graph 1 (figure not reproduced)

2.5 Ans:
(a)
Step 1: S0 = {<(Ø Ø Ø Ø), (Ø Ø Ø Ø)>}, G0 = {<(? ? ? ?), (? ? ? ?)>}
Step 2: S1 = {<(male brown tall US), (female black short US)>}, G1 = {<(? ? ? ?), (? ? ? ?)>}
Step 3: S2 = {<(male brown ? ?), (female black short US)>}, G2 = {<(? ? ? ?), (? ? ? ?)>}
Step 4: S3 = {<(male brown ? ?), (female black short US)>}, G3 = {<(male ? ? ?), (? ? ? ?)>, <(? ? ? ?), (? ? ? US)>}
Step 5: S4 = {<(male brown ? ?), (female ? short ?)>}, G4 = {<(male ? ? ?), (? ? ? ?)>}
(b) Each attribute of a hypothesis consistent with the first positive example can take one of two values: the value observed in that example, or ?. So the number of hypotheses consistent with the example is (2*2*2*2)*(2*2*2*2) = 256.
(c) The shortest query sequence has length 8, since 2^8 = 256.
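Matching for the two-person tuple hypotheses in 2.5 can be sketched as follows. The `matches` helper and the tuple encoding are my own illustration; it confirms that the final S4 and G4 from the trace both cover the first positive example (the instance to which S1 is set).

```python
def matches(pattern, instance):
    # A pair-hypothesis matches an instance iff, in both 4-tuples, every
    # slot is either "?" or equal to the instance's value.
    return all(p == "?" or p == v
               for pat, inst in zip(pattern, instance)
               for p, v in zip(pat, inst))

S4 = (("male", "brown", "?", "?"), ("female", "?", "short", "?"))
G4 = (("male", "?", "?", "?"), ("?", "?", "?", "?"))

first_positive = (("male", "brown", "tall", "US"),
                  ("female", "black", "short", "US"))
print(matches(S4, first_positive), matches(G4, first_positive))  # True True
```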