Latent DXU Fault Causing a High Uplink PCU Packet Loss Count
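Fault case: Latent DXU fault causing a high uplink PCU packet loss count
Provincial company:    Discipline: Wireless    Equipment type: Macro cell
Vendor: Ericsson    Model: RBS2202    Software version:
Date written: 2009-3-1    Author:    Author phone:
Date filed:    Reviewer:    Reviewer phone:
Vendor reviewer:    Contact:
Keywords: DXU, packet loss, transmission, RP, PCU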
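Fault Symptoms
Since late February, DGM19B1 has shown a high uplink packet loss count.
Alarm Information
None.
Cause Analysis
A high PCU packet loss count usually has one of three causes: (1) an RP fault, (2) a transmission fault, or (3) a base station hardware fault.
In February the DGM19B1 network element began to show a severely high uplink PCU packet loss count, as shown in the figure below:
First, an OPS script was used to trace the packet loss count of each cell in the network element.
The trace showed that the cells with high packet loss were not concentrated on any single RP, so an RP board fault could be ruled out.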
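The concentration check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `discard_share_by_rp`, the cell names, the RP numbers other than 115, and all counts are hypothetical sample values, not data from the OPS trace in this case.

```python
from collections import defaultdict


def discard_share_by_rp(cells):
    """Return {rp: share of total uplink discards}.

    `cells` is an iterable of (cell_name, rp_number, uplink_discards)
    tuples, for example collected from a per-cell trace.
    """
    totals = defaultdict(int)
    for _cell, rp, discards in cells:
        totals[rp] += discards
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No discards at all: every RP has a share of zero.
        return {rp: 0.0 for rp in totals}
    return {rp: count / grand_total for rp, count in totals.items()}


# Hypothetical sample data (not taken from the case).
sample = [
    ("CELL_A", 115, 1100),
    ("CELL_B", 115, 40),
    ("CELL_C", 116, 900),
    ("CELL_D", 117, 860),
    ("CELL_E", 117, 100),
]

shares = discard_share_by_rp(sample)
for rp, share in sorted(shares.items()):
    print(f"RP {rp}: {share:.0%} of uplink discards")
```

If one RP carried nearly all of the discards, an RP board fault would be the first suspect; a spread like the one in this sample points away from the RP and toward individual cells.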
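The trace showed that cell 麻四 (G46MSI1), CELLIND MAC_06D, had a very high uplink packet loss count.
We therefore reset its uplink packet loss counter to zero and checked every few minutes how fast it grew.
Normally the counter should grow very slowly or not at all; in an abnormal cell it grows quickly.
The cell's RP and MAC_*** code were obtained with RLGRP and RLDEP.
The procedure was as follows: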
<TERDI:RP=115;
DEBUGGER PATH OPENED        (TERDI is the command used to enter the RP)
CONNECTION SUCCESSFUL
USE '?' OR 'HELP' FOR RP/EMRP SYSTEM MONITOR
AND 'RDSHELP' FOR RP/EMRP SYSTEM DEBUGGER HELP
USE ESCAPE CHARACTER '<' OR '/'(SEQUENCE)
TO GET LOWER CASE LETTERS
Console - monitor connection ready
OSmon>
Connecting to the main target at ose_ldm/ ...
OSmon> :DISPLAY PROCELL MP_MAC_***
Error: Unrecognized command
OSmon> :DISPLAY PROCESS MP_MAC_***
Process Name Identity Type Pri State
------------ -------- ---- --- -----
MP_MAC_08F 393659 prio 21 receive
MP_MAC_03E 4718913 prio 21 receive
MP_MAC_028 ******* prio 21 receive
MP_MAC_018 3801466 prio 21 receive
MP_MAC_01D 983472 prio 21 receive
MP_MAC_06D 2752860 prio 21 receive
MP_MAC_002 1900909 prio 21 receive
*** Display PROCESS end ***
OSmon> :APT GETCELLSTAT MP_MAC_06D
Running getcellstat...
Cell activation time: 2009-02-20 12:03:52
Statistics since: 2009-02-20 12:03:52
'mac' Round Trip Delay = 3
'mac' Round Trip Delay in FN = 13
'mac' RTT Guardtime = 12
'mac' Max Round Trip Delay in FN = 108
'mac' Number of Active LPDCHs = 16
'mac' Number of Active LPSETs = 3
'mac' Number of Active Uplink TBFs = 6
'mac' Number of Active Downlink TBFs = 10
'mac' Discarded Paging Messages per thousand = 0
'mac' Number of scheduled polling requested by con = 818202
'mac' Number of unsuccessful polling requested by con = 147453
'mac' Number of scheduled polling requested by rlc = 12339066
'mac' Number of unsuccessful polling requested by rlc = 632348
'mac' Number of successful FN requests from con = 335776
'mac' Number of unsuccessful FN requests from con = 0
'mac' Number of received GSL error signals on active PDCH = 5774251
'mac' Number of received GSL error signals on not active PDCH = 157747
'mac' Number of uplink messages with C or E bit error = 561840
'mac' Number of CRC error on Air = 3480091
'mac' Number of CRC error on GSL = 395086
'mac' Unexpected position reached in DSP code = 0
'mac' Number of unexpected frames uplink = 821339
'mac' Number of uplink frame sync error = 118526
'mac' Number of RX buffer overrun = 0
'mac' Number of RX autobuffer congestion = 0
'mac' Number of downlink block error = 14
'mac' Number of TX autobuffer underrun = 0
'mac' Number of RLC/MAC blocks with invalid txProtocolBuff data = 3410
'mac' Number of CCU data frames with invalid coding scheme = 143278
'mac' Number of received blocks with four invalid Access bursts = 108777
'mac' Number of lost synchronisation against CCU = 5653
'mac' Number of lost DL2 sychronisation with GSS = 0
'mac' Number of downlink frame sync error = 294007
'mac' Number of RB not scheduled on PDCHs due to FN sync supervision = 7156
'mac' Number of RB not scheduled on MPDCH due to FN sync supervision = 0
'mac' Number of blocked RB used by higher prioritized block = 0
'mac' Number of discarded downlink blocks (ack DISCDL) = 3421
'mac' Number of discarded uplink blocks (ack DISCUL) = 1100190
OSmon> :APT CLEARCELLSTAT MP_MAC_06D        (reset the counters to zero)
Running clearcellstat...
'mac' Local counters are reset.
Counters for Cell/Mac process MP_MAC_06D cleared!
OSmon> :APT GETCELLSTAT MP_MAC_06D
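After the reset, watch the uplink packet loss count in real time; if it is high within one hour, the cell generally has a problem.
Shortly after the reset, the cell's uplink packet loss count was queried in real time as follows: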
OSmon> :APT GETCELLSTAT MP_MAC_06D
Running getcellstat...
Cell activation time: 2009-02-20 12:03:52
Statistics since: 2009-02-25 14:54:23
'mac' Round Trip Delay = 4
'mac' Round Trip Delay in FN = 17
'mac' RTT Guardtime = 12
'mac' Max Round Trip Delay in FN = 108
'mac' Number of Active LPDCHs = 11
'mac' Number of Active LPSETs = 2
'mac' Number of Active Uplink TBFs = 2
'mac' Number of Active Downlink TBFs = 6
'mac' Discarded Paging Messages per thousand = 0
'mac' Number of scheduled polling requested by con = 305
'mac' Number of unsuccessful polling requested by con = 39
'mac' Number of scheduled polling requested by rlc = 3839
'mac' Number of unsuccessful polling requested by rlc = 324
'mac' Number of successful FN requests from con = 150
'mac' Number of unsuccessful FN requests from con = 0
'mac' Number of received GSL error signals on active PDCH = 3349
'mac' Number of received GSL error signals on not active PDCH = 195
'mac' Number of uplink messages with C or E bit error = 329
'mac' Number of CRC error on Air = 2402
'mac' Number of CRC error on GSL = 45
'mac' Unexpected position reached in DSP code = 0
'mac' Number of unexpected frames uplink = 525
'mac' Number of uplink frame sync error = 72
'mac' Number of RX buffer overrun = 0
'mac' Number of RX autobuffer congestion = 0
'mac' Number of downlink block error = 0
'mac' Number of TX autobuffer underrun = 0
'mac' Number of RLC/MAC blocks with invalid txProtocolBuff data = 2
'mac' Number of CCU data frames with invalid coding scheme = 0
'mac' Number of received blocks with four invalid Access bursts = 22
'mac' Number of lost synchronisation against CCU = 6
'mac' Number of lost DL2 sychronisation with GSS = 0
'mac' Number of downlink frame sync error = 141
'mac' Number of RB not scheduled on PDCHs due to FN sync supervision = 5
'mac' Number of RB not scheduled on MPDCH due to FN sync supervision = 0
'mac' Number of blocked RB used by higher prioritized block = 0
'mac' Number of discarded downlink blocks (ack DISCDL) = 2
'mac' Number of discarded uplink blocks (ack DISCUL) = 374
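As the output above shows, after the reset the cell's uplink discard counter climbed to 374 in under one minute.
This indicates that cell G46MSI1 (麻四1) has severe uplink packet loss.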
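To put the 374 discards in context, the two GETCELLSTAT outputs can be parsed and the long-term average rate before the reset estimated. The following is a minimal sketch, not an Ericsson tool: the helper `parse_cellstat` and the regular expressions are assumptions made for illustration, and only a few lines of each output are reproduced. It also assumes the first snapshot was taken shortly before the reset, so the reset time (the second "Statistics since" value) is used as the end of the first measurement window.

```python
import re
from datetime import datetime

# Matches "'mac' <counter name> = <integer>", also when several
# counters are fused onto one line.
COUNTER_RE = re.compile(r"'mac' (.+?) = (\d+)")
SINCE_RE = re.compile(
    r"Statistics since: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)
DISCUL = "Number of discarded uplink blocks (ack DISCUL)"


def parse_cellstat(text):
    """Return (statistics_since, {counter_name: value}) from GETCELLSTAT text."""
    counters = {
        name.strip(): int(value) for name, value in COUNTER_RE.findall(text)
    }
    match = SINCE_RE.search(text)
    since = (
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if match
        else None
    )
    return since, counters


# Excerpts of the two outputs shown in this case.
before = """Statistics since: 2009-02-20 12:03:52
'mac' Round Trip Delay = 3 'mac' Round Trip Delay in FN = 13
'mac' Number of discarded uplink blocks (ack DISCUL) = 1100190"""

after = """Statistics since: 2009-02-25 14:54:23
'mac' Number of discarded uplink blocks (ack DISCUL) = 374"""

since_before, counters_before = parse_cellstat(before)
since_after, counters_after = parse_cellstat(after)

# Assumption: the reset time approximates when the first snapshot was taken.
window_minutes = (since_after - since_before).total_seconds() / 60
avg_per_minute = counters_before[DISCUL] / window_minutes

print(f"Average before reset: {avg_per_minute:.0f} uplink discards per minute")
print(f"Observed after reset: {counters_after[DISCUL]} discards")
```

Under that assumption the long-term average works out to roughly 149 discards per minute, so 374 discards in under a minute is well above the cell's own average since activation.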
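The cell's voice traffic and data service KPIs were all normal, the transmission link showed no bit errors, and there was no obvious base station fault.
At this point we suspected either a poorly seated transmission connector or a DXU fault, so a work order was issued to the site team to check the transmission connectors and replace the DXU.
After the DXU was replaced at 10:00 on the 26th, the uplink packet loss count of DGM19B1 dropped sharply, and the cell showed almost no packet loss.
The packet loss count of the DGM19B1 network element stayed at about 1,000 per hour; the KPIs after the fix are shown below:
With that, the high PCU uplink packet loss problem was fully resolved, and the root cause was confirmed to be a latent DXU fault.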
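Fault Summary
PCU packet loss is mainly caused by RP faults, transmission faults, or faults in other equipment.
Because Ericsson BSC statistics only provide PCU uplink and downlink packet loss counts, we can first trace the RP event records to determine whether the fault is at RP level or at cell level.
If the cells under an entire RP are generally abnormal, replace the RP; if only a single cell is affected, carefully check that cell's transmission, carrier units, and other equipment for faults.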
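The rule above can be written down as a small decision helper. This is a hedged sketch only: the function `triage`, the 0.8 threshold for "generally abnormal", and the normal/abnormal flags for every process except MP_MAC_06D are assumptions for illustration; the case itself only measured MP_MAC_06D.

```python
def triage(abnormal_by_cell, widespread_ratio=0.8):
    """Suggest a next step from per-cell abnormal flags under one RP.

    `abnormal_by_cell` maps a cell/MAC process name to True if its uplink
    discard counter grows quickly after a reset. `widespread_ratio` is a
    hypothetical threshold for treating the problem as RP-wide.
    """
    if not abnormal_by_cell:
        raise ValueError("no cells supplied")
    abnormal = sorted(cell for cell, bad in abnormal_by_cell.items() if bad)
    ratio = len(abnormal) / len(abnormal_by_cell)
    if ratio >= widespread_ratio:
        return "RP-level fault suspected: replace the RP"
    if abnormal:
        return (
            "cell-level fault suspected: check transmission, carrier units "
            "and other hardware for " + ", ".join(abnormal)
        )
    return "no abnormal cells found"


# MAC processes listed under RP 115 in this case; only MP_MAC_06D was
# actually measured as abnormal, the other flags are assumed.
rp115 = {
    "MP_MAC_08F": False,
    "MP_MAC_03E": False,
    "MP_MAC_018": False,
    "MP_MAC_01D": False,
    "MP_MAC_06D": True,
    "MP_MAC_002": False,
}
print(triage(rp115))
```

The threshold is a judgment call; in practice the decision would also weigh traffic volume per cell and the transmission and hardware checks described in this case.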
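In this case, transmission quality was normal and the voice KPIs showed no obvious anomaly, so we suspected a DXU fault: the DXU controls and manages the cell during operation, faces the BSC interface, and manages all other hardware in the cell.
In day-to-day network optimization work we often see KPI anomalies caused by latent DXU faults; they are not discussed further here.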