ZooKeeper Source Code (7): Request Processing

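This article uses standalone startup as the example to explain how ZooKeeper processes requests.

First, a recap of the request processing chain in standalone mode.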

// A standalone server chains three request processors:
// PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor
protected void setupRequestProcessors() {
    RequestProcessor finalProcessor = new FinalRequestProcessor(this);
    RequestProcessor syncProcessor = new SyncRequestProcessor(this,
            finalProcessor);
    ((SyncRequestProcessor)syncProcessor).start();
    firstProcessor = new PrepRequestProcessor(this, syncProcessor);
    ((PrepRequestProcessor)firstProcessor).start();
}
The call chain of a request is as follows (each arrow points from the callee back to its caller):
PrepRequestProcessor.processRequest() <- ZooKeeperServer.submitRequest() <- ZooKeeperServer.processPacket() <- NettyServerCnxn.receiveMessage() <- CnxnChannelHandler.processMessage() <- CnxnChannelHandler.messageReceived()
The RequestProcessor interface:
public interface RequestProcessor {
    public static class RequestProcessorException extends Exception {
        public RequestProcessorException(String msg, Throwable t) {
            super(msg, t);
        }
    }
    // Process a request
    void processRequest(Request request) throws RequestProcessorException;
    // Shut down this processor and its downstream processors; a processor may be a thread
    void shutdown();
}
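To make the chaining concrete, here is a minimal sketch of the same pattern. It does not use ZooKeeper's classes: SimpleProcessor, Stage and ChainDemo are hypothetical names, requests are plain strings, and every stage runs synchronously on the caller's thread, whereas the real PrepRequestProcessor and SyncRequestProcessor each run on their own thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for RequestProcessor; a request is just a String here.
interface SimpleProcessor {
    void process(String request);
}

// A stage that records that it saw the request and then forwards it downstream.
class Stage implements SimpleProcessor {
    private final String name;
    private final SimpleProcessor next; // null marks the end of the chain
    private final List<String> trace;

    Stage(String name, SimpleProcessor next, List<String> trace) {
        this.name = name;
        this.next = next;
        this.trace = trace;
    }

    @Override
    public void process(String request) {
        trace.add(name + ":" + request);
        if (next != null) {
            next.process(request);
        }
    }
}

class ChainDemo {
    // Builds the chain back to front, in the same order as setupRequestProcessors().
    static List<String> run(String request) {
        List<String> trace = new ArrayList<>();
        SimpleProcessor finalStage = new Stage("final", null, trace);
        SimpleProcessor syncStage = new Stage("sync", finalStage, trace);
        SimpleProcessor prepStage = new Stage("prep", syncStage, trace);
        prepStage.process(request);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run("create /a"));
    }
}
```

Building the chain from the last stage backwards means each processor receives its successor as a constructor argument, so the link can be declared final.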
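I. PrepRequestProcessor
PrepRequestProcessor is the server's request pre-processor. It determines whether the current client request is a transactional request and, if so, performs a series of pre-processing steps, such as creating the transaction header and transaction body, checking the session, and checking ACLs.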

(1) PrepRequestProcessor constructor
public class PrepRequestProcessor extends ZooKeeperCriticalThread implements RequestProcessor {
    // Queue of submitted requests
    LinkedBlockingQueue<Request> submittedRequests = new LinkedBlockingQueue<Request>();
    // Next processor in the chain
    private final RequestProcessor nextProcessor;
    // The ZooKeeper server
    ZooKeeperServer zks;
    public PrepRequestProcessor(ZooKeeperServer zks, RequestProcessor nextProcessor) {
        // Initialize the thread
        super("ProcessThread(sid:" + zks.getServerId() + " cport:"
                + zks.getClientPort() + "):", zks.getZooKeeperServerListener());
        this.nextProcessor = nextProcessor;
        this.zks = zks;
    }
}
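Note: the core fields of the class are submittedRequests and nextProcessor. The former holds the requests that have been submitted to this processor; the latter is the next processor that requests are handed to.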

(2) RequestProcessor interface implementation
// Accept a request
public void processRequest(Request request) {
    submittedRequests.add(request);
}
// Shut down the thread
public void shutdown() {
    LOG.info("Shutting down");
    submittedRequests.clear();
    submittedRequests.add(Request.requestOfDeath);
    nextProcessor.shutdown();
}
Since every request is placed into submittedRequests, something must consume that queue. Let's look at what the thread does.

(3) run (core)
public void run() {
    try {
        while (true) {
            Request request = submittedRequests.take();
            long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
            if (request.type == OpCode.ping) { // The request is a PING
                traceMask = ZooTrace.CLIENT_PING_TRACE_MASK;
            }
            if (Request.requestOfDeath == request) { // Stop the thread
                break;
            }
            pRequest(request); // Process the request (core)
        }
    } catch (RequestProcessorException e) {
        if (e.getCause() instanceof XidRolloverException) {
            LOG.info(e.getCause().getMessage());
        }
        handleException(this.getName(), e);
    } catch (Exception e) {
        handleException(this.getName(), e);
    }
    LOG.info("PrepRequestProcessor exited loop!");
}
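Note: run() overrides Thread.run(). Its core logic is fairly simple: it keeps taking requests from the queue and hands each one to pRequest(). Because submittedRequests.take() blocks while the queue is empty, the loop does not spin when idle, and a single dedicated thread processes requests in the order they were submitted.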
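The queue-plus-sentinel pattern used by processRequest(), shutdown() and run() can be sketched in isolation as follows. PoisonPillDemo and DEATH are hypothetical names standing in for the processor and for Request.requestOfDeath; requests are plain strings, and the sketch does not clear the queue on shutdown the way PrepRequestProcessor does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

class PoisonPillDemo {
    // Sentinel compared by identity, playing the role of Request.requestOfDeath.
    static final String DEATH = new String("death");

    // Feeds the inputs to a worker thread and returns what the worker handled, in order.
    static List<String> consume(List<String> inputs) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String request = queue.take(); // blocks while the queue is empty
                    if (request == DEATH) {
                        break; // sentinel seen: leave the loop
                    }
                    handled.add(request);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        for (String s : inputs) {
            queue.add(s); // producer side, like processRequest()
        }
        queue.add(DEATH); // like shutdown()
        try {
            worker.join(); // join() makes the worker's writes to 'handled' visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(consume(List.of("create", "getData", "ping")));
    }
}
```

Using a sentinel object instead of interrupting the thread lets the consumer stop at a well-defined point in the queue.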

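II. SyncRequestProcessor
Having looked at PrepRequestProcessor, we now turn to SyncRequestProcessor. This processor persists requests to disk, writing them in batches for efficiency. A request is not forwarded to the next processor until it has been written to disk.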

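Besides persisting requests to local disk, SyncRequestProcessor also maintains the local txnlog and snapshots. The basic logic is:
After roughly every snapCount/2 transactional requests it takes a new snapshot and rolls the txnlog. To keep all ZooKeeper servers from snapshotting and rolling their logs at the same moment, a random offset is added to that threshold. The default value of snapCount is 100,000 requests.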
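The threshold rule can be checked with a small standalone sketch. SnapshotTrigger is a hypothetical class, not part of ZooKeeper; it only reproduces the counter arithmetic that appears in SyncRequestProcessor.run() (logCount > snapCount / 2 + randRoll) and assumes snapCount is at least 2.

```java
import java.util.Random;

class SnapshotTrigger {
    private final int snapCount;
    private final Random random;
    private int randRoll;     // random offset in [0, snapCount / 2)
    private int logCount = 0; // transactions appended since the last snapshot

    SnapshotTrigger(int snapCount, Random random) {
        this.snapCount = snapCount;
        this.random = random;
        this.randRoll = random.nextInt(snapCount / 2);
    }

    // Call once per appended transaction; returns true when a snapshot is due.
    boolean onTxnAppended() {
        logCount++;
        if (logCount > snapCount / 2 + randRoll) {
            randRoll = random.nextInt(snapCount / 2); // pick a new offset for the next round
            logCount = 0;
            return true;
        }
        return false;
    }

    // Counts how many transactions are appended before the first snapshot fires.
    static int txnsUntilFirstSnapshot(int snapCount, long seed) {
        SnapshotTrigger trigger = new SnapshotTrigger(snapCount, new Random(seed));
        int n = 0;
        do {
            n++;
        } while (!trigger.onTxnAppended());
        return n;
    }

    public static void main(String[] args) {
        System.out.println(txnsUntilFirstSnapshot(100_000, 42L));
    }
}
```

With this arithmetic the gap between two snapshots always falls between snapCount/2 + 1 and snapCount transactions, so with the default of 100,000 a server snapshots somewhere between 50,001 and 100,000 transactions after the previous one.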
(1) Important fields
public class SyncRequestProcessor extends ZooKeeperCriticalThread implements RequestProcessor {
    private final ZooKeeperServer zks;
    // Queue that receives requests passed in from outside
    private final LinkedBlockingQueue<Request> queuedRequests = new LinkedBlockingQueue<Request>();
    private final RequestProcessor nextProcessor;
    // Thread that takes the snapshot
    private Thread snapInProcess = null;
    volatile private boolean running;
    // Requests that have been logged and are waiting to be flushed to disk
    private final LinkedList<Request> toFlush = new LinkedList<Request>();
    private final Random r = new Random(System.nanoTime());
    // Number of transactions used as the snapshot threshold
    private static int snapCount = ZooKeeperServer.getSnapCount();
    // Sentinel request that tells the thread to stop
    private final Request requestOfDeath = Request.requestOfDeath;
}
(2) run (core method)
public void run() {
    try {
        // 1. Initialization: no transactions logged yet
        int logCount = 0;
        // Random offset so that the servers do not all take a snapshot at the same time
        int randRoll = r.nextInt(snapCount/2);
        while (true) {
            Request si = null;
            // 2. Nothing is waiting to be flushed: take() the next request, blocking if necessary
            if (toFlush.isEmpty()) {
                si = queuedRequests.take();
            // 3. Otherwise poll() for the next request without blocking
            } else {
                si = queuedRequests.poll();
                // No new request: flush what has accumulated to disk first
                if (si == null) {
                    flush(toFlush);
                    continue;
                }
            }
            if (si == requestOfDeath) {
                break;
            }
            if (si != null) {
                // 4. Append the request to the transaction log; only transactional requests return true
                if (zks.getZKDatabase().append(si)) {
                    logCount++;
                    if (logCount > (snapCount / 2 + randRoll)) {
                        randRoll = r.nextInt(snapCount/2);
                        // 4.1 Roll the log
                        zks.getZKDatabase().rollLog();
                        // 4.2 Take a snapshot; skip it if the previous snapshot thread is still running
                        if (snapInProcess != null && snapInProcess.isAlive()) {
                            LOG.warn("Too busy to snap, skipping");
                        } else {
                            snapInProcess = new ZooKeeperThread("Snapshot Thread") {
                                public void run() {
                                    try {
                                        zks.takeSnapshot();
                                    } catch(Exception e) {
                                        LOG.warn("Unexpected exception", e);
                                    }
                                }
                            };
                            snapInProcess.start();
                        }
                        logCount = 0;
                    }
                // 5. Not a transactional request and toFlush is empty (recent traffic is read-heavy):
                //    hand the request straight to the next processor
                } else if (toFlush.isEmpty()) {
                    if (nextProcessor != null) {
                        nextProcessor.processRequest(si);
                        if (nextProcessor instanceof Flushable) {
                            ((Flushable)nextProcessor).flush();
                        }
                    }
                    continue;
                }
                toFlush.add(si);
                if (toFlush.size() > 1000) {
                    flush(toFlush);
                }
            }
        }
    } catch (Throwable t) {
        handleException(this.getName(), t);
    } finally{
        running = false;
    }
    LOG.info("SyncRequestProcessor exited!");
}
(3) flush (flush to disk)
private void flush(LinkedList<Request> toFlush) throws IOException, RequestProcessorException {
    if (toFlush.isEmpty())
        return;
    // 1. Commit the pending transaction log writes through the ZK database
    zks.getZKDatabase().commit();
    // 2. Hand every flushed request to the next processor
    while (!toFlush.isEmpty()) {
        Request i = toFlush.remove();
        if (nextProcessor != null) {
            nextProcessor.processRequest(i);
        }
    }
    if (nextProcessor != null && nextProcessor instanceof Flushable) {
        // Flush the next processor as well if it is Flushable
        ((Flushable)nextProcessor).flush();
    }
}
Note: this function flushes the requests in the toFlush queue to disk and then passes them on to the next processor.
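The batching behaviour of run() and flush() (accumulate while requests keep arriving, flush when the pending list exceeds a limit or the input goes idle) can be sketched without any disk I/O. BatchFlusher is a hypothetical class; a "flush" here only records the batch, where the real code commits the transaction log and forwards each request. The limit is a parameter, in place of the hard-coded 1000 in the original.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class BatchFlusher {
    private final LinkedList<String> toFlush = new LinkedList<>();
    private final List<List<String>> batches = new ArrayList<>();
    private final int maxPending;

    BatchFlusher(int maxPending) {
        this.maxPending = maxPending;
    }

    // Drains the incoming queue, flushing when the pending list grows past
    // maxPending and once more when no further request is available.
    void drain(Queue<String> incoming) {
        String request;
        while ((request = incoming.poll()) != null) {
            toFlush.add(request);
            if (toFlush.size() > maxPending) {
                flush();
            }
        }
        flush(); // the queue is idle: flush whatever has accumulated
    }

    // One "commit" per batch; the real flush() would sync the log here.
    private void flush() {
        if (toFlush.isEmpty()) {
            return;
        }
        batches.add(new ArrayList<>(toFlush));
        toFlush.clear();
    }

    List<List<String>> batches() {
        return batches;
    }

    // Convenience helper: runs one drain over the given requests.
    static List<List<String>> run(int maxPending, List<String> requests) {
        BatchFlusher flusher = new BatchFlusher(maxPending);
        flusher.drain(new ArrayDeque<>(requests));
        return flusher.batches();
    }

    public static void main(String[] args) {
        System.out.println(run(2, List.of("a", "b", "c", "d", "e")));
    }
}
```

Because the size check uses a strict greater-than, as in the original, a full batch holds maxPending + 1 requests. Grouping requests this way pays the cost of one disk sync for many transactions.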

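III. FinalRequestProcessor
FinalRequestProcessor applies committed write operations to the local server; for read operations it reads the data from the local server and returns it to the client. It is the last processor in the chain. FinalRequestProcessor processes requests synchronously, and its main logic lives in the processRequest method:
1. If request.hdr != null, the request is a write operation, and zks.processTxn(hdr, txn) is called to apply the write to the in-memory state.
2. If it is a write operation, zks.getZKDatabase().addCommittedProposal(request) is called to add the request to the committedLog queue, which is kept mainly so that followers can be synchronized quickly.
3. Response data is prepared for each kind of operation: for a write, the reply is based on the result of processTxn; for a read, the in-memory state is read.
4. The response is sent to the client.
processRequest is very long, so we will go through it piece by piece.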

(1) Handling transactional requests
public void processRequest(Request request) {
    ProcessTxnResult rc = null;
    synchronized (zks.outstandingChanges) {
        // 1. Delegate the request to ZooKeeperServer, which handles transactional
        //    and non-transactional requests separately
        rc = zks.processTxn(request);
        // 2. request.hdr != null means a transactional request, i.e. a write.
        //    outstandingChanges holds the records of all pending transactional requests:
        //    PrepRequestProcessor adds them, and FinalRequestProcessor removes them here
        //    once the transaction has been applied
        if (request.getHdr() != null) {
            // Transaction header
            TxnHeader hdr = request.getHdr();
            Record txn = request.getTxn();
            long zxid = hdr.getZxid();
            // ZooKeeper applies transactions in strict order, so any record whose zxid
            // is not greater than this one is considered processed
            while (!zks.outstandingChanges.isEmpty()
                    && zks.outstandingChanges.get(0).zxid <= zxid) {
                ChangeRecord cr = zks.outstandingChanges.remove(0);
                if (cr.zxid < zxid) {
                    LOG.warn("Zxid outstanding " + cr.zxid + " is less than current " + zxid);
                }
                if (zks.outstandingChangesForPath.get(cr.path) == cr) {
                    zks.outstandingChangesForPath.remove(cr.path);
                }
            }
        }
        // 3. If the request is a quorum (transactional) request, add it to the committedLog queue
        if (request.isQuorum()) {
            zks.getZKDatabase().addCommittedProposal(request);
        }
    }
}
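The bookkeeping in step 2 is easier to follow in isolation. The sketch below uses hypothetical names (OutstandingChanges, Change) and models only the two collections and the pruning loop; the add() method is an assumption about how records get registered, based on the comment that PrepRequestProcessor adds them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OutstandingChanges {
    // Simplified stand-in for ChangeRecord: just a zxid and a path.
    static final class Change {
        final long zxid;
        final String path;

        Change(long zxid, String path) {
            this.zxid = zxid;
            this.path = path;
        }
    }

    final List<Change> changes = new ArrayList<>();     // ordered by zxid
    final Map<String, Change> byPath = new HashMap<>(); // latest pending change per path

    // Assumed registration step: append the record and make it the latest for its path.
    void add(long zxid, String path) {
        Change change = new Change(zxid, path);
        changes.add(change);
        byPath.put(path, change);
    }

    // Drops every record with a zxid not greater than the applied one.
    void prune(long appliedZxid) {
        while (!changes.isEmpty() && changes.get(0).zxid <= appliedZxid) {
            Change change = changes.remove(0);
            // Only clear the per-path entry if it still points at this record;
            // a newer pending change for the same path must stay visible.
            if (byPath.get(change.path) == change) {
                byPath.remove(change.path);
            }
        }
    }

    public static void main(String[] args) {
        OutstandingChanges oc = new OutstandingChanges();
        oc.add(1, "/a");
        oc.add(2, "/b");
        oc.add(3, "/a");
        oc.prune(2);
        System.out.println(oc.changes.size() + " pending, paths " + oc.byPath.keySet());
    }
}
```

The identity check before removing from the map is the important detail: after pruning up to zxid 2 in the example, the entry for /a survives because it refers to the still-pending change with zxid 3.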
processRequest delegates the request to zks. Let's see how ZooKeeperServer handles it.

public ProcessTxnResult processTxn(Request request) {
    return processTxn(request, request.getHdr(), request.getTxn());
}
private ProcessTxnResult processTxn(Request request, TxnHeader hdr,
        Record txn) {
    ProcessTxnResult rc;
    int opCode = request != null ? request.type : hdr.getType();
    long sessionId = request != null ? request.sessionId : hdr.getClientId();
    if (hdr != null) {
        // Write operation (transactional request)
        rc = getZKDatabase().processTxn(hdr, txn);
    } else {
        // Read operation (non-transactional request)
        rc = new ProcessTxnResult();
    }
    if (opCode == OpCode.createSession) {
        if (hdr != null && txn instanceof CreateSessionTxn) {
            CreateSessionTxn cst = (CreateSessionTxn) txn;
            sessionTracker.addGlobalSession(sessionId, cst.getTimeOut());
        } else if (request != null && request.isLocalSession()) {
            request.request.rewind();
            int timeout = request.request.getInt();
            request.request.rewind();
            sessionTracker.addSession(request.sessionId, timeout);
        } else {
            LOG.warn("*****>>>>> Got " + txn.getClass() + " " + txn.toString());
        }
    } else if (opCode == OpCode.closeSession) {
        sessionTracker.removeSession(sessionId);
    }
    return rc;
}
(2) Building the response
// 1. For a write (transactional request), the response is built from the result of processTxn()
case OpCode.create: {
    lastOp = "CREA";
    rsp = new CreateResponse(rc.path);
    err = Code.get(rc.err);
    break;
}
// 2. For a read (non-transactional request), the response is built from the in-memory database
case OpCode.getData: {
    lastOp = "GETD";
    GetDataRequest getDataRequest = new GetDataRequest();
    ByteBufferInputStream.byteBuffer2Record(request.request,
            getDataRequest);
    DataNode n = zks.getZKDatabase().getNode(getDataRequest.getPath());
    if (n == null) {
        throw new KeeperException.NoNodeException();
    }
    Long aclL;
    synchronized(n) {
        aclL = n.acl;
    }
    PrepRequestProcessor.checkACL(zks, zks.getZKDatabase().convertLong(aclL),
            ZooDefs.Perms.READ,
            request.authInfo);
    Stat stat = new Stat();
    // Read the response data directly from the in-memory database
    byte b[] = zks.getZKDatabase().getData(getDataRequest.getPath(), stat,
            getDataRequest.getWatch() ? cnxn : null);
    rsp = new GetDataResponse(b, stat);
    break;
}