2024-06-18

Automatically Defeating the ScatterBee Obfuscation of a ShadowPad Variant with an IDA Script


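APT41 (also known as Winnti Group, Amoeba, and Wicked Spider), a Chinese threat group, has been active worldwide since 2010, with targets across Europe, Asia, and the Americas. Besides custom malware tailored to each victim's systems, its main tooling includes the subject of this article: ShadowPad and its ScatterBee variant.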

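APT41 has used ShadowPad, an advanced modular remote access trojan, since at least 2017. From 2019 onward the malware was gradually adopted by many other China-linked threat groups and used against industries including aviation, energy, finance, telecommunications, and education. Incidents such as the 2017 compromise of the CCleaner cleanup utility and the 2019 attacks on several universities during Hong Kong's anti-extradition bill protests show how ShadowPad keeps evolving and threatens multiple sectors.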

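In this article, CyCraft security researcher 趙偉捷 (oalieno) analyzes the ShadowPad Loader in depth and dissects a variant found in the wild that uses the ScatterBee obfuscation technique. Analyses of ScatterBee are scattered across several articles, and although IDA plugins exist on open-source collaboration platforms, using them on this sample was relatively complicated and did not work smoothly. We therefore provide a complete IDA script that helps restore the deobfuscated assembly.

What you will learn from this article:

ShadowPad threat intelligence and a technical analysis of the ShadowPad Loader.
An IDA script that automatically deobfuscates the ScatterBee variant of ShadowPad.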

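ShadowPad Loader Sample Analysis

In August 2023, the attacker broke into the victim machine through a web vulnerability and dropped three files on the endpoint: log.dll, log.dll.dat, and DRM.exe. These files are the samples analyzed in this article.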

DRM.exe, originally named BDReinit.exe, is a legitimate executable signed by BitDefender. The attacker abuses this legitimate, signed executable for DLL sideloading in order to load the malicious log.dll.

DLL Sideloading

log.dll is a loader. It relies on the signed, legitimate DRM.exe to load it, then reads and executes the second-stage shellcode, log.dll.dat, from the same folder.

Self-modifying code

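First, DllMain calls sub_10001010 (Figure 1). sub_10001010 checks whether the first byte of Sleep and of CreateMutexW is 0xE8 or 0xE9 (the x86 opcodes for a relative call and a relative jmp) to detect API hooking. API hooking is commonly used by sandboxes and EDR products: by overwriting the first few bytes of a Windows API function, they hijack the call and can monitor how a program uses the Windows API.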
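The hook check described above can be sketched offline in Python. This is only an illustration of the first-byte test, not the loader's actual code: `looks_hooked` and the sample prologue bytes are hypothetical, chosen to show a typical unhooked prologue and a detour-style patch.

```python
# Opcodes the loader looks for at the very start of an API function.
HOOK_OPCODES = {0xE8, 0xE9}  # 0xE8 = call rel32, 0xE9 = jmp rel32

def looks_hooked(prologue: bytes) -> bool:
    """Return True if the first byte of a function is a call/jmp opcode.

    Hypothetical helper: it mirrors the check described in the article,
    it is not taken from the sample.
    """
    return len(prologue) > 0 and prologue[0] in HOOK_OPCODES

# "mov edi, edi; push ebp; mov ebp, esp": a common unhooked 32-bit prologue.
clean = bytes.fromhex("8bff558bec")
# A detour-style hook usually overwrites the start with "jmp rel32"
# (the displacement here is made up).
hooked = bytes.fromhex("e978563412")

print(looks_hooked(clean), looks_hooked(hooked))
```

An analyst could feed this the first bytes of Sleep and CreateMutexW dumped from a sandbox process to see whether the sample would bail out there.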

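Next, sub_10001010 uses GetModuleHandleW to find the in-memory address of the main executable that loaded log.dll and checks whether the value at offset 0x2777 equals 0x840FC33B. This ties log.dll to DRM.exe (md5: 8a8db1e20dc508af5a81fc00b1929468): it only runs when it is loaded by that binary.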
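A minimal sketch of this host-binding check follows. It assumes the value is read as a little-endian 32-bit integer (the usual x86 behavior, which would make the in-memory bytes 3B C3 0F 84) and that the offset is relative to the module base as mapped in memory. `is_expected_host` and the synthetic buffer are hypothetical.

```python
import struct

HOST_CHECK_OFFSET = 0x2777      # offset from the module base, per the article
HOST_CHECK_VALUE = 0x840FC33B   # expected dword, per the article

def is_expected_host(mapped_image: bytes,
                     offset: int = HOST_CHECK_OFFSET,
                     expected: int = HOST_CHECK_VALUE) -> bool:
    """Return True if the dword at `offset` matches the expected marker.

    `mapped_image` is assumed to be the host module as laid out in memory
    (for example, a memory dump starting at the module base).
    """
    if offset + 4 > len(mapped_image):
        return False
    (value,) = struct.unpack_from("<I", mapped_image, offset)
    return value == expected

# Synthetic stand-in for a mapped DRM.exe: zeros with the marker planted.
fake_image = bytearray(0x3000)
struct.pack_into("<I", fake_image, HOST_CHECK_OFFSET, HOST_CHECK_VALUE)

print(is_expected_host(bytes(fake_image)))   # marker present
print(is_expected_host(bytes(0x3000)))       # marker absent
```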

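After these checks, it calls VirtualProtect to make that code address writable (the address lies in the .text code section, whose original permission is RX) and overwrites the assembly that follows LoadLibraryW("log.dll") with call sub_10001000 (in Figure 2-2, log.dll is loaded at 0x74330000). Finally, it calls VirtualProtect once more to change the permission again. As a result, when sub_10001010 returns to DRM.exe, execution reaches call sub_10001000 and jumps into sub_10001000 in log.dll.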

Figure 2-1: LoadLibraryW("log.dll") before modification

Figure 2-2: LoadLibraryW("log.dll") after modification

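ScatterBee Obfuscation

After sub_10001000 is called, all subsequent code is further obfuscated with ScatterBee (a name coined by PwC's threat intelligence team in the UK). ScatterBee scatters the individual assembly instructions and then chains them back together with a special jmp function, similar to the code below.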

push ebp
jmp B

A:
mov ebp, esp
jmp C

B:
sub esp, 0x10
jmp A

C:
...
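To make the idea concrete, here is a toy Python model of the listing above. It is not how IDA represents code: the `scattered` dictionary and the `linearize` helper are hypothetical, and each entry simply maps a label to one instruction plus the label its trailing jmp points to. Following the jumps recovers the execution order.

```python
# Toy model: label -> (instruction, label targeted by the trailing jmp).
scattered = {
    "entry": ("push ebp", "B"),
    "A": ("mov ebp, esp", "C"),
    "B": ("sub esp, 0x10", "A"),
    "C": ("...", None),
}

def linearize(blocks, start):
    """Follow the jmp chain from `start` and return instructions in
    execution order. Stops at the end of the chain or on a loop."""
    order, seen = [], set()
    label = start
    while label is not None and label not in seen:
        seen.add(label)
        insn, next_label = blocks[label]
        order.append(insn)
        label = next_label
    return order

for insn in linearize(scattered, "entry"):
    print(insn)
```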

‍

‍

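As Figure 3 shows, this special jmp function is sub_10006374, and the address it actually jumps to is 0x1000A181 + 0xFFFFFDAA = 0x10009F2B (truncated to 32 bits). The call pushes its return address (0x1000A181) onto the stack; the function adds the 32-bit value stored at that address to the return address and then returns to the result. sub_10006374 executes the following assembly (junk code such as mov eax, eax, xchg ax, ax, jp + jnp, and so on has been omitted).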

xchg ecx, [esp]
pushf
add ecx, [ecx]
popf
xchg ecx, [esp]
ret
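The arithmetic performed by this function can be reproduced in a few lines of Python. This is a sketch under the assumption that the call instruction is the 5-byte E8 form, so the return address is the call address plus 5; `resolve_magic_jump` is a hypothetical helper name.

```python
def resolve_magic_jump(call_addr: int, displacement: int, call_size: int = 5) -> int:
    """Compute where the special jmp function lands.

    call_addr    : address of the `call sub_10006374` instruction
    displacement : 32-bit value stored right after the call
    call_size    : assumed size of the call instruction (E8 rel32 = 5 bytes)
    """
    return_addr = call_addr + call_size          # what `call` pushes
    return (return_addr + displacement) & 0xFFFFFFFF  # add ecx, [ecx]; ret

# Values from the article: call at 0x1000A17C, dword 0xFFFFFDAA after it.
print(hex(resolve_magic_jump(0x1000A17C, 0xFFFFFDAA)))
```

Because the addition wraps at 32 bits, a displacement such as 0xFFFFFDAA acts as a negative offset (here -0x256).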

‍

Custom decode function

‍

Figure 4: Custom decode function of the ShadowPad Loader

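The ShadowPad Loader has a custom decode function. The first four bytes of the input are the key and the rest is the encrypted string. For every byte, the key is updated as 17 * key - 0x443246ba (other samples use different formulas, such as 8 * key + 0x107E666D), and the XOR key is the sum of the four bytes of the key. The whole process looks like the following Python script: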

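def decode(key: bytes, enc: bytes) -> bytes:
    # key: the first four bytes of the input, read as little-endian
    key = int.from_bytes(key, 'little')
    dec = bytearray()

    for b in enc:
        # advance the 32-bit key state once per encrypted byte
        key = (17 * key - 0x443246ba) & 0xffffffff
        # XOR key = low byte of the sum of the key's four bytes
        dec.append(b ^ (sum(key.to_bytes(4, 'little')) & 0xff))

    return bytes(dec)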
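As a usage sketch, the snippet below splits a blob into key and ciphertext and round-trips a test string. Because the scheme is a plain XOR keystream, running the same routine over plaintext produces ciphertext. `keystream_decode` restates the routine so the example is self-contained; `decode_blob`, the key, and the test string are hypothetical and do not come from the sample.

```python
def keystream_decode(key: bytes, enc: bytes) -> bytes:
    """Same algorithm as the article's decode(), restated for a standalone demo."""
    state = int.from_bytes(key, "little")
    out = bytearray()
    for b in enc:
        state = (17 * state - 0x443246BA) & 0xFFFFFFFF
        out.append(b ^ (sum(state.to_bytes(4, "little")) & 0xFF))
    return bytes(out)

def decode_blob(blob: bytes) -> bytes:
    """Split an encoded blob into its 4-byte key and payload, then decode."""
    if len(blob) < 4:
        raise ValueError("blob must contain at least the 4-byte key")
    return keystream_decode(blob[:4], blob[4:])

# Made-up key and plaintext, only to demonstrate the round trip.
key = bytes.fromhex("01020304")
plain = b"kernel32.dll"
blob = key + keystream_decode(key, plain)   # XOR is its own inverse

print(decode_blob(blob))
```

For samples that use another key schedule, such as 8 * key + 0x107E666D, only the state update line would need to change.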

‍

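Load payload

In addition, the ShadowPad Loader reads log.dll.dat, which contains the encrypted shellcode. Right after reading it, the loader deletes the file and stores the encrypted shellcode in the D1EBF8C1 value under HKCU:\SOFTWARE\Classes\WOW6432Node\CLSID\{a44eee15-f652-fccc-801fdd3405aef4f8}. The storage location is hard-coded, but it differs from the one in the report that Elastic Security Labs published in February 2023, so we suspect the value is changed for every campaign.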

Figure 5: Registry location where the encrypted shellcode is stored

If log.dll.dat cannot be recovered during an investigation, try the following PowerShell command to read the shellcode from the registry for analysis:

-join ((Get-ItemProperty -Path "HKCU:\SOFTWARE\Classes\WOW6432Node\CLSID\{a44eee15-f652-fccc-801fdd3405aef4f8}").D1EBF8C1 | ForEach-Object { $_.ToString("X2") }) > payload.txt
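The command above writes the payload as one long hex string, so it has to be converted back to raw bytes before analysis. The sketch below is one way to do that. It assumes the file may be UTF-16 with a byte order mark (which Windows PowerShell's `>` redirection commonly produces) or plain UTF-8; `hex_dump_to_bytes` is a hypothetical helper.

```python
import codecs

def hex_dump_to_bytes(raw: bytes) -> bytes:
    """Convert the contents of payload.txt (a hex string) to raw bytes.

    Handles UTF-16 files with a BOM as well as UTF-8/ASCII files, and
    ignores whitespace such as the trailing newline.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")
    else:
        text = raw.decode("utf-8-sig")
    return bytes.fromhex("".join(text.split()))

# Synthetic example standing in for a real payload.txt.
sample = "4D5A9000\r\n".encode("utf-16")
print(hex_dump_to_bytes(sample))

# Typical use (paths are placeholders):
# with open("payload.txt", "rb") as f:
#     shellcode = hex_dump_to_bytes(f.read())
# with open("payload.bin", "wb") as f:
#     f.write(shellcode)
```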

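Defeating ScatterBee Obfuscation with an IDA Script

Having worked out the ScatterBee obfuscation logic, we set out to write an IDA script that reassembles the scattered assembly and patches it back.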

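Reassembling the scattered assembly takes the following steps:

Starting from 0x1000A17C, extract the original instructions with a DFS and store them as a directed graph.
  - call and jxx (jb, jl, ...) instructions need special handling.
  - Stop at ret.
Allocate new addresses.
  - Consecutive instructions must be allocated together.
Recompute the targets of call and jxx for the new addresses.
Patch the binary again.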
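The third step, recomputing call and jxx targets, comes down to rewriting a relative displacement. A minimal sketch, assuming every branch uses a 32-bit relative displacement in its last four bytes (short 8-bit jumps are not handled); `relocate_rel32` and the addresses are hypothetical:

```python
def relocate_rel32(insn: bytes, new_addr: int, new_target: int) -> bytes:
    """Rewrite the trailing rel32 of a call/jxx so that, once the
    instruction sits at `new_addr`, it branches to `new_target`.

    The displacement is relative to the address of the next instruction.
    """
    if len(insn) < 5:
        raise ValueError("expected an instruction with a rel32 operand")
    rel = (new_target - (new_addr + len(insn))) & 0xFFFFFFFF
    return insn[:-4] + rel.to_bytes(4, "little")

# call (E8 rel32) moved to 0x10001180, target moved to 0x10001200.
print(relocate_rel32(bytes.fromhex("e800000000"), 0x10001180, 0x10001200).hex())

# jb near (0F 82 rel32) at 0x10001190 branching backwards to 0x10001180.
print(relocate_rel32(bytes.fromhex("0f8200000000"), 0x10001190, 0x10001180).hex())
```

Masking with 0xFFFFFFFF gives backward branches the correct two's-complement encoding.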

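The original assembly contains a lot of junk code in the form of cmp esp, xxxx + jb pairs. How can we tell which instructions are junk? On Windows and Linux the stack grows downward, and stack allocations are usually aligned to 0x10000, so the low two bytes of esp start at 0xFFFF and grow downward; as long as the program does not use much stack, they usually look like 0xF???. While traversing the instructions with the DFS, we found that the following two instructions compare esp against a random value smaller than 0xF000, so the jb condition is never met. This confirms that they are junk code that can simply be ignored.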

cmp esp, 0x1234
jb 0x1000abcd ; never jump
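The junk pair can be recognized from raw instruction bytes. The sketch below uses the two cmp esp encodings (81 FC imm32 and 83 FC imm8) and treats a pair as junk when the immediate is below 0xF000 and the next mnemonic is jb. The helper names and the threshold parameter are illustrative assumptions.

```python
def cmp_esp_immediate(insn: bytes):
    """Return the immediate of `cmp esp, imm`, or None for anything else."""
    if len(insn) == 6 and insn[:2] == b"\x81\xfc":      # cmp esp, imm32
        return int.from_bytes(insn[2:6], "little")
    if len(insn) == 3 and insn[:2] == b"\x83\xfc":      # cmp esp, imm8
        # imm8 is sign-extended to 32 bits by the CPU
        return int.from_bytes(insn[2:3], "little", signed=True) & 0xFFFFFFFF
    return None

def is_junk_pair(cmp_insn: bytes, next_mnem: str, threshold: int = 0xF000) -> bool:
    """True if this looks like the never-taken `cmp esp, small; jb` pair.

    jb fires only when esp < imm (unsigned), which cannot happen for a
    small immediate while esp holds a real stack address.
    """
    imm = cmp_esp_immediate(cmp_insn)
    return imm is not None and imm < threshold and next_mnem == "jb"

print(is_junk_pair(bytes.fromhex("81fc34120000"), "jb"))   # cmp esp, 0x1234 ; jb
print(is_junk_pair(bytes.fromhex("83fcff"), "jb"))         # imm sign-extends to 0xFFFFFFFF
```

This check is stricter than necessary for the sample discussed here, but it avoids discarding a genuine stack comparison in other binaries.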

The complete script we ended up with is:

import idc
import ida_bytes

def X(x):
    return f"0x{x:08x}"

class DeObfus:
    def __init__(self, magic_function_addr, first_avaliable_addr):
        self.insts = {}
        self.magic_function_addr = magic_function_addr
        self.obf_flag = False
        # There are DllMain and a initialize function in front of FIRST_ADDR
        self.cursor = first_avaliable_addr

    @staticmethod
    def create_insn_force(addr):
        idc.del_items(idc.get_item_head(addr))
        if idc.create_insn(addr) > 0:
            return True
        for i in range(1, 6):
            idc.del_items(addr + i, idc.DELIT_SIMPLE)
            if idc.create_insn(addr) > 0:
                return True
        return False

    def handle_inst(self, addr, prev):
        if prev:
            self.insts[prev]["next_direct"] = addr
        if self.insts.get(addr) is not None:
            return None
        inst = {
            "addr": addr,
            "size": idc.get_item_size(addr),
            "bytes": idc.get_bytes(addr, idc.get_item_size(addr)),
            "op0": idc.get_operand_value(addr, 0),
            "mnem": idc.print_insn_mnem(addr),
            "disasm": idc.generate_disasm_line(addr, 0),
            "is_function_head": False,
            "next_direct": None,
            "next_branch": None
        }
        self.insts[addr] = inst
        if inst["mnem"] == 'call':
            # call eax with operand value not an addr
            # call to .data section might be shellcode
            if 0x1001000 <= inst["op0"] < 0x10016000:
                self.trace(inst["op0"], prev_branch=addr, is_function_head=True)
        elif inst["mnem"][0] == 'j':
            if 0x1001000 <= inst["op0"] < 0x10016000:
                self.trace(inst["op0"], prev_branch=addr, is_function_head=False)
        if 'ret' in inst["mnem"] or inst["mnem"] == 'jmp':
            return None
        return addr + inst["size"]

    def trace(self, start, prev_branch=None, is_function_head=True):
        addr, prev = start, None
        while addr:
            # convert target area to code in IDA
            if not self.create_insn_force(addr):
                raise ValueError(f"[!] Create instruction fail at {X(addr)} (start: {X(start)})")
            size = idc.get_item_size(addr)
            bytes_ = idc.get_bytes(addr, size)
            op0 = idc.get_operand_value(addr, 0)
            mnem = idc.print_insn_mnem(addr)
            # magic jump
            if mnem == 'call' and op0 == self.magic_function_addr:
                offset = int.from_bytes(idc.get_bytes(addr + 5, 4), 'little')
                addr = (addr + 5 + offset) & 0xffffffff
                continue
            # skip cmp esp, xxx + jb obfuscation combination
            # cmp esp, 0x???????? (\x81\xfc\x??\x??\x??\x??)
            # cmp esp, 0x?? (\x83\xfc\x??)
            if bytes_[:2] == b'\x81\xfc' or bytes_[:2] == b'\x83\xfc':
                self.obf_flag = True
                addr = addr + size
                continue
            if self.obf_flag and mnem == 'jb':
                self.obf_flag = False
                addr = addr + size
                continue
            # can't skip jmp
            # there will be two insts have the same next_direct
            #if mnem == 'jmp':
            #    addr = op0
            #    continue
            addr_next = self.handle_inst(addr, prev)
            # first inst
            if prev is None:
                self.insts[addr]["is_function_head"] = is_function_head
                if prev_branch:
                    self.insts[prev_branch]["next_branch"] = addr
            addr, prev = addr_next, addr

    def get_addr(self, size):
        addr = self.cursor
        self.cursor += size
        return addr

    def allocate(self, addr):
        branches = []
        # this stream has been allocated
        if self.insts[addr].get("new_addr") is not None:
            return
        while addr:
            inst = self.insts[addr]
            if inst.get("new_addr") is not None:
                raise ValueError(f"{X(addr)} ({inst['disasm']}) from two next_direct !??")
            inst["new_addr"] = self.get_addr(inst["size"])
            if inst["next_branch"]:
                branches.append(inst["next_branch"])
            addr = inst["next_direct"]
        # allocate local branches together
        for addr in branches:
            inst = self.insts[addr]
            if not self.insts[inst["head"]]["is_function_head"]:
                self.allocate(inst["head"])

    def patch(self):
        patch_bytes = {}
        for addr, inst in self.insts.items():
            inst["is_child"] = False
        for addr, inst in self.insts.items():
            if inst["next_direct"]:
                self.insts[inst["next_direct"]]["is_child"] = True
        # chain the stream
        for addr, inst in self.insts.items():
            if inst["is_child"]:
                continue
            head = addr
            while addr:
                self.insts[addr]["head"] = head
                addr = self.insts[addr]["next_direct"]
        # all have head
        for addr, inst in self.insts.items():
            if inst.get("head") is None:
                raise ValueError(f"{X(addr)} no head")
        # allocate
        for addr, inst in self.insts.items():
            if inst["is_function_head"]:
                self.allocate(addr)
                print(f"[+] Create function at {X(inst['new_addr'])}")
                # align 0x10
                align = 0x10 - self.cursor % 0x10
                patch_bytes[self.cursor] = b'\xcc' * align
                self.cursor += align
        # check all instructions have been allocated
        for addr, inst in self.insts.items():
            if inst.get("new_addr") is None:
                raise ValueError(f"[!] {X(addr)} is not allocated")
        # relocation
        for addr, inst in self.insts.items():
            if not inst["next_branch"]:
                continue
            target = self.insts[inst["next_branch"]]
            inst["bytes"] = inst["bytes"][:-4] + (
                (0x100000000 +
                    target["new_addr"] - (inst["new_addr"] + inst["size"])
                ) & 0xFFFFFFFF
            ).to_bytes(4, 'little')
        # wipe old code
        for addr, inst in self.insts.items():
            ida_bytes.patch_bytes(addr, b'\x90' * inst["size"])
        # actual patch
        for addr, inst in self.insts.items():
            patch_bytes[inst["new_addr"]] = inst["bytes"]
        for addr in sorted(patch_bytes.keys()):
            ida_bytes.patch_bytes(addr, patch_bytes[addr])
            self.create_insn_force(addr)

    def run(self, address):
        self.trace(address)
        self.patch()

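When using this IDA script, you need to set the following three addresses yourself, depending on the sample:

magic_function_addr: the special jmp function mentioned with Figure 3, which is called over and over throughout the code.
first_avaliable_addr: the address where the patched code is written. We suggest placing it right after DllMain, because the obfuscated assembly that originally lived there will already have been recovered and can be overwritten directly.
The address of the first call to the special jmp function, passed to run().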

de = DeObfus(magic_function_addr=0x10006374, first_avaliable_addr=0x10001180)
de.run(0x1000A17C)

Yara Rules

The Yara rule we provide is:

rule ShadowPad_Loader_Decode: CyCraft ShadowPad APT {
meta:
    author = "oalieno"
    description = "Custom decode function of ShadowPad Loader"
    severity = 9
    confidence = 9
    sample_hash = "af6d2e58163999e00d57809efe765274"
    malware_family = "ShadowPad"

strings:
    $c1 = {
      8D 8C D1 46 B9 CD BB    // lea     ecx, [ecx+edx*8-443246BAh]
      E8 ?? ?? ?? ??          // call xxx
    }
condition:
    all of them
}

In addition, Elastic Security Labs provided Yara rules in the report mentioned above:

rule Windows_Trojan_ShadowPad_1 {
    meta:
        author = "Elastic Security"
        creation_date = "2023-01-23"
        last_modified = "2023-01-31"
        description = "Target SHADOWPAD obfuscation loader+payload"
        os = "Windows"
        arch = "x86"
        category_type = "Trojan"
        family = "ShadowPad"
        threat_name = "Windows.Trojan.ShadowPad"
        license = "Elastic License v2"
    strings:
        $a1 = { 87 0? 24 0F 8? }
        $a2 = { 9C 0F 8? }
        $a3 = { 03 0? 0F 8? }
        $a4 = { 9D 0F 8? }
        $a5 = { 87 0? 24 0F 8? }
    condition:
        all of them
}

rule Windows_Trojan_Shadowpad_2 {
    meta:
        author = "Elastic Security"
        creation_date = "2023-01-31"
        last_modified = "2023-01-31"
        description = "Target SHADOWPAD loader"
        os = "Windows"
        arch = "x86"
        category_type = "Trojan"
        family = "Shadowpad"
        threat_name = "Windows.Trojan.Shadowpad"
        license = "Elastic License v2"
    strings:
        $a1 = "{%8.8x-%4.4x-%4.4x-%8.8x%8.8x}"
    condition:
        all of them
}

rule Windows_Trojan_Shadowpad_3 {
    meta:
        author = "Elastic Security"
        creation_date = "2023-01-31"
        last_modified = "2023-01-31"
        description = "Target SHADOWPAD payload"
        os = "Windows"
        arch = "x86"
        category_type = "Trojan"
        family = "Shadowpad"
        threat_name = "Windows.Trojan.Shadowpad"
        license = "Elastic License v2"
    strings:
        $a1 = "hH#whH#w" fullword
        $a2 = "Yuv~YuvsYuvhYuv]YuvRYuvGYuv1:tv<Yuvb#tv1Yuv-8tv&Yuv" fullword
        $a3 = "pH#wpH#w" fullword
        $a4 = "HH#wHH#wA" fullword
        $a5 = "xH#wxH#w:$" fullword
        $re1 = /(HTTPS|TCP|UDP):\/\/[^:]+:443/
    condition:
        4 of them
}

‍

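Conclusion

ShadowPad's anti-forensics and anti-analysis features let it evade detection during initial intrusion and long-term persistence, making it a powerful tool in supply chain attacks. In the internal documents leaked from the Chinese company i-Soon (安洵信息) in early 2024, the KELA threat intelligence team found that ShadowPad was listed as well, along with ShadowPad C2 server addresses. This shows that ShadowPad and its variants are favored not only by APT41 but are also used by other Chinese threat groups against countries around the world. The IDA script and the ShadowPad Loader technical analysis written by CyCraft's researcher help mitigate this kind of threat and give the security community a starting point for further research.

IOC: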

log.dll (md5: f4693d792c0edbcc3ed62bf8222a3aca)

log.dll.dat (md5: 60940d341c313eee08dcd7b18154ce0a)

‍

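About CyCraft

CyCraft (奧義智慧) is a leading AI cybersecurity company in Asia, focused on AI-automated threat exposure management. Its XCockpit AI platform integrates the three defensive pillars of XASM (Extended Attack Surface Management): external exposure early-warning management, trust and privilege optimization monitoring, and automated endpoint joint defense, providing proactive, pre-incident, and real-time defense in depth. With a strong track record in government, finance, and the semiconductor and high-tech industries, and recognition from organizations such as Gartner, CyCraft continues to build Asia's most advanced AI security operations center and to safeguard enterprise digital resilience.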
