Retain Tip: Resolving EXT QPI LINK Errors

엔지니어-FIXER 2012. 4. 20. 14:34

"EXT QPI LINK" events and possible operating system hang - IBM System x3850 X5 (7143, 7191)

 

 

Symptoms seen on the systems listed above

 

Sensor "Ext QPI Link 1" has transitioned from normal to non-critical state
Sensor "Ext QPI Link 2" has transitioned from normal to non-critical state

 

The messages above, or the following:

 

(Slot/Connector - ): Assertion: Transition to Critical from less severe.

  Transition to Critical from less severe

 

When messages like these appear in IPMI, apply the fix described in this tip.

 

 

Messages seen in the operating system

 

 

 

  • Microsoft Windows Server 2008 Release 2 (WS08R2) with Microsoft Cluster Services (MSCS) running:

      WHEA Event ID 19 A corrected hardware error occurred. Processor core error source:1
      Error Type:10 Processor ID:130
      (Processor IDs may vary)

    Note: Hangs have not been observed on WS08R2 without MSCS running.

  • VMware ESX 4.1

      cpu60:4156)MCE: 1363: MCE on cpu60 bank1:
      Status:0x9800004000020e0f Misc:0x2 Addr:0x0: Valid.Err enabled.Misc valid.
      cpu60:4156)MCE: 1367: Status bits: "Bus and Interconnect: OtherTrans Bus Generic error."

  • VMware ESX 5.0

      cpu61:8253)MCE: 1278: CMCI on cpu61 bank1: Status:0x9800004000020e0f Misc:0x2 Addr:0x0: Valid.Err enabled.Misc valid.
      cpu61:8253)MCE: 1282 Status bits: "Bus and Interconnect: OtherTrans Bus Generic error."
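
When checking logs for this signature, it can help to decode the Status value from the VMware lines instead of comparing strings by eye. The following is a minimal Python sketch and is not part of the IBM tip. The function names (parse_mce, decode_status) are hypothetical, the regular expressions assume the line layout of the ESX samples above, and the bit positions follow the IA32_MCi_STATUS layout described in the Intel architecture manuals.

```python
import re

# Assumed layout, based only on the ESX 4.1 / 5.0 samples quoted in this post.
MCE_HEADER = re.compile(r"(?:MCE|CMCI) on cpu(\d+) bank(\d+):")
MCE_STATUS = re.compile(r"Status:(0x[0-9a-fA-F]+)")


def decode_status(status):
    """Decode the architectural flag bits of an IA32_MCi_STATUS value."""
    mca_code = status & 0xFFFF  # bits 15:0, MCA error code
    return {
        "valid": bool((status >> 63) & 1),        # VAL
        "overflow": bool((status >> 62) & 1),     # OVER
        "uncorrected": bool((status >> 61) & 1),  # UC
        "enabled": bool((status >> 60) & 1),      # EN
        "misc_valid": bool((status >> 59) & 1),   # MISCV
        "addr_valid": bool((status >> 58) & 1),   # ADDRV
        "mca_code": mca_code,
        # Bus and interconnect codes have the form 0000 1PPT RRRR IILL.
        "bus_interconnect": (mca_code & 0xF800) == 0x0800,
    }


def parse_mce(text):
    """Return decoded fields for the first MCE/CMCI record in text, or None."""
    header = MCE_HEADER.search(text)
    status = MCE_STATUS.search(text)
    if header is None or status is None:
        return None
    fields = decode_status(int(status.group(1), 16))
    fields["cpu"] = int(header.group(1))
    fields["bank"] = int(header.group(2))
    return fields


if __name__ == "__main__":
    sample = (
        "cpu61:8253)MCE: 1278: CMCI on cpu61 bank1: "
        "Status:0x9800004000020e0f Misc:0x2 Addr:0x0: "
        "Valid.Err enabled.Misc valid."
    )
    print(parse_mce(sample))
```

For the sample status the valid, enabled and misc_valid flags are set and uncorrected is clear, which matches the "Valid.Err enabled.Misc valid." text in the log and the corrected-error wording of the Windows event.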

     

If you see any of these, apply the fix right away.

     

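A fixed UEFI release is reportedly planned for the second quarter of 2012, and that is what should be applied.

If it has not been released yet, use the workarounds below.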

     

     

Workaround 1

Revert the server to third quarter 2011 UEFI, IMM, and Field Programmable Gate Array (FPGA) code levels as shown below. With these levels, the error may be ignored with minimal impact to performance.

  UEFI v1.71a g0e171a
  IMM  v1.30  yuooc7e
  FPGA v2.01  g0ud72b
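
If several servers need checking, the comparison against the Workaround 1 levels can be scripted. This is only a sketch under stated assumptions: the helper name levels_to_change is hypothetical, and the installed build IDs are assumed to be collected by hand (for example from the IMM or F1 Setup screens) and passed in as a plain dictionary. The sketch does not query the hardware.

```python
# Workaround 1 levels, copied from the tip: component -> (version, build ID).
WORKAROUND_LEVELS = {
    "UEFI": ("1.71a", "g0e171a"),
    "IMM": ("1.30", "yuooc7e"),
    "FPGA": ("2.01", "g0ud72b"),
}


def levels_to_change(installed):
    """Return the components whose build ID differs from Workaround 1.

    installed maps a component name to the build ID found on the server.
    A missing component is reported as a mismatch.
    """
    mismatches = {}
    for component, (version, build) in WORKAROUND_LEVELS.items():
        current = installed.get(component)
        if current is None or current.lower() != build:
            mismatches[component] = f"v{version} {build}"
    return mismatches
```

An empty result means all three components already match the Workaround 1 levels; otherwise each entry names the level to revert that component to.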

Workaround 2

After applying Workaround 1, change the QPI link speed to Power Efficiency from Maximum Performance as follows:

  F1 Setup --> System Settings --> Operating Modes --> QPI Link Frequency --> Power Efficiency

     

These are the two workarounds given, so apply them as described.

     

     

Original link: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5089038