Posted
On 9/6/2025 at 8:43 PM, MC874 said:

@HorridModz,
Definitely, it's getting to be quite a hassle to reverse everything manually in IDA. It would help me (and probably others) make the process faster.

I just noticed your comment about IDA. If your use case is simply to find offsets, this tool does much more than what you're looking for. As for the AOB generation, all it does is dumbly check whether each instruction contains `0x` or `#` (which is not a foolproof system and results in false positives).
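To make the false positives concrete, here is a minimal sketch of that text check on its own. The function name `looks_dynamic` and the sample instructions are made up for illustration and are not taken from the tool.

```python
def looks_dynamic(instruction: str) -> bool:
    """Text-only heuristic: flag anything with a hex literal or a '#' immediate."""
    return "0x" in instruction or "#" in instruction


# Hypothetical ARM instructions, chosen only to illustrate the behaviour
samples = [
    "bl 0x1234",       # branch target: likely to change between builds
    "mov r0, #1",      # constant immediate: stable, but flagged anyway
    "add r1, r2, r3",  # registers only: kept as literal bytes
]

flags = [looks_dynamic(s) for s in samples]
for text, flagged in zip(samples, flags):
    print(f"{text:<16} -> {'wildcard' if flagged else 'keep'}")
```

The second sample is the kind of false positive I mean: the immediate never moves, yet the whole instruction gets wildcarded, which makes the signature less specific than it could be.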

IDA supports AOB searches, and there are surely better tools out there that you can use to generate AOBs. For instance, https://guidedhacking.com/threads/aob-signature-maker.8524/ seems promising.
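For anyone unsure what an AOB search actually does, here is a rough standard-library sketch. It is not how IDA implements it; `parse_aob`, `find_aob`, and the sample bytes are hypothetical names and data used only to show the idea of matching with `??` wildcards, optionally inside a limited offset range.

```python
from typing import Optional


def parse_aob(pattern: str) -> list:
    """Turn '55 ?? E5' into [0x55, None, 0xE5]; None means 'match any byte'."""
    return [None if tok == "??" else int(tok, 16) for tok in pattern.split()]


def find_aob(data: bytes, pattern: str, start: int = 0, end: Optional[int] = None) -> list:
    """Return every offset in data[start:end] where the whole pattern matches."""
    sig = parse_aob(pattern)
    if not sig:
        return []
    end = len(data) if end is None else min(end, len(data))
    hits = []
    for i in range(start, end - len(sig) + 1):
        if all(b is None or data[i + j] == b for j, b in enumerate(sig)):
            hits.append(i)
    return hits


# Made-up buffer: the same x86 prologue twice, with different call displacements
blob = bytes.fromhex("00 55 89 E5 E8 10 00 00 00 5D C3 55 89 E5 E8 AA BB CC DD 5D C3")
pattern = "55 89 E5 E8 ?? ?? ?? ?? 5D C3"

print(find_aob(blob, pattern))           # both copies match
print(find_aob(blob, pattern, start=5))  # only the second copy is in range
```

Restricting `start` and `end` is one simple way to cope with a signature that has too many wildcards and matches in more than one place.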

I'm not trying to discourage you from using my tool, I just want to clarify that it's nothing magical.

Posted
Quote

As for the AOB generation, all it does is dumbly check whether each instruction contains `0x` or `#` (which is not a foolproof system and results in false positives).

Hi @HorridModz,
Yeah, I noticed that every instruction containing a `0x` value gets replaced. Well, it works in the end by limiting the address range.

Posted
20 hours ago, MC874 said:

Hi @HorridModz,
Yeah, I noticed that every instruction containing a `0x` value gets replaced. Well, it works in the end by limiting the address range.

Yes, of course! But if that's all you want, just copy the few lines of code. In fact, I blogged about creating the tool, including all of the code snippets I used:

import itertools
import binascii
import keystone
import capstone


def remove_whitespace(s: str) -> str:
    return "".join(s.split())


def wraptext(s: str, size: int) -> list[str]:
    # Thanks to https://stackoverflow.com/questions/9475241/split-string-every-nth-character
    return [s[i:i + size] for i in range(0, len(s), size)]


def getbytes(hexstring: str) -> list[str]:
    """
    Splits a hex string into a list of bytes. Convenient function because it accounts for both
    whitespace-separated and un-separated hex strings.
    """
    hexstring = remove_whitespace(hexstring)
    assert len(hexstring) % 2 == 0, "Invalid hex string (odd length)"
    return wraptext(hexstring, 2)


def make_ks(architecture: str) -> keystone.Ks:
    if architecture == "32bit":
        return keystone.Ks(keystone.KS_ARCH_ARM, keystone.KS_MODE_ARM)
    elif architecture == "64bit":
        return keystone.Ks(keystone.KS_ARCH_ARM64, keystone.KS_MODE_LITTLE_ENDIAN)
    else:
        raise ValueError(f"Unrecognized architecture: {architecture}. Only '32bit' and '64bit' are valid strings")


def make_cs(architecture: str) -> capstone.Cs:
    if architecture == "32bit":
        return capstone.Cs(capstone.CS_ARCH_ARM, capstone.CS_MODE_ARM)
    elif architecture == "64bit":
        return capstone.Cs(capstone.CS_ARCH_ARM64, capstone.CS_MODE_LITTLE_ENDIAN)
    else:
        raise ValueError(f"Unrecognized architecture: {architecture}. Only '32bit' and '64bit' are valid strings")


def armtohex(instruction: str, architecture: str) -> str:
    ks = make_ks(architecture)
    # ks.asm returns (encoding, statement_count); keep only the encoded bytes
    convertedinstruction = ks.asm(instruction, as_bytes=True)[0]
    return binascii.hexlify(convertedinstruction).decode().upper()


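def hextoarm(hexinstruction: str, architecture: str) -> tuple[str, str]:
    cs = make_cs(architecture)
    # disasm_lite yields (address, size, mnemonic, op_str); keep (mnemonic, op_str) of the first instruction
    return next(cs.disasm_lite(bytearray.fromhex(hexinstruction), 0x0))[2:]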


def generateaobfromarm(armcode: str, architecture: str) -> str:
    # Convert string of code to list of instructions
    instructions = list(itertools.chain(*[split1.split(";") for split1 in armcode.split("\n")]))
    unknownhex = "??" * 4
    hexlist = []
    for instruction in instructions:
        if instruction == "" or instruction.isspace():
            continue
        if "0x" in instruction or "#" in instruction:
            hexlist.append(unknownhex)
        else:
            hexlist.append(armtohex(instruction, architecture))
    # Hexlist is a list of 4 byte sequences, and we want our separator in between every byte, so we do this little
    # maneuver.
    aob = "".join(hexlist)  # Unformatted
    return " ".join(getbytes(aob))

 

Posted
On 9/22/2025 at 10:55 AM, MC874 said:

Hi @HorridModz,
Thank you. I'm new to keystone, so reading your tool helps me understand it. The WordPress link seems to be private, though.

Oops sorry, wrong link. Here you go.

Anyway, I found the keystone / capstone modes for x86 and x86_64. So I'll go update the tool right now.
 

For your needs, this should be a sufficient script, though:

 

OFFSET = "0x970000"
LIB_PATH = r"C:\Users\zachy\Downloads\frida-gadget-17.3.2-android-x86.so"
ARCHITECTURE = "x86"  # OR: "x86_64"


from functools import cache
import itertools
import binascii
import keystone
import capstone


def remove_whitespace(s: str) -> str:
    return "".join(s.split())


def wraptext(s: str, size: int) -> list[str]:
    # Thanks to https://stackoverflow.com/questions/9475241/split-string-every-nth-character
    return [s[i:i + size] for i in range(0, len(s), size)]


def getbytes(hexstring: str) -> list[str]:
    """
    Splits a hex string into a list of bytes. Convenient function because it accounts for both
    whitespace-separated and un-separated hex strings.
    """
    hexstring = remove_whitespace(hexstring)
    assert len(hexstring) % 2 == 0, "Invalid hex string (odd length)"
    return wraptext(hexstring, 2)

@cache
def bytecount(hexstring: str) -> int:
    """
    Counts the number of bytes in a hex string. Very simple function, but improves readability.
    """
    return len(getbytes(hexstring))


@cache
def make_ks(architecture: str) -> keystone.Ks:
    if architecture == "32bit":
        return keystone.Ks(keystone.KS_ARCH_ARM, keystone.KS_MODE_ARM)
    elif architecture == "64bit":
        return keystone.Ks(keystone.KS_ARCH_ARM64, keystone.KS_MODE_LITTLE_ENDIAN)
    elif architecture == "x86":
        return keystone.Ks(keystone.KS_ARCH_X86, keystone.KS_MODE_32)
    elif architecture == "x86_64":
        return keystone.Ks(keystone.KS_ARCH_X86, keystone.KS_MODE_64)
    else:
        raise ValueError(f"Unrecognized architecture: {architecture}. Only '32bit', '64bit', 'x86', and 'x86_64' are "
                         f"valid strings")


@cache
def make_cs(architecture: str) -> capstone.Cs:
    if architecture == "32bit":
        cs = capstone.Cs(capstone.CS_ARCH_ARM, capstone.CS_MODE_ARM)
    elif architecture == "64bit":
        cs = capstone.Cs(capstone.CS_ARCH_ARM64, capstone.CS_MODE_LITTLE_ENDIAN)
    elif architecture == "x86":
        cs = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32)
    elif architecture == "x86_64":
        cs = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64)
    else:
        raise ValueError(f"Unrecognized architecture: {architecture}. Only '32bit', '64bit', 'x86', and 'x86_64' are "
                         f"valid strings")
    cs.detail = True
    return cs


def offset_to_hex(offset: str, libfile: str, hexbytes: int = 600, sep: str = " "):
    try:
        decimal_offset = int(offset, 16)
    except ValueError:
        raise ValueError(f"Invalid offset: {offset}. Please provide a hexadecimal value.")
    with open(libfile, "rb") as lib:
        # Read certain number of bytes from offset
        lib.seek(decimal_offset)
        hexstr = lib.read(hexbytes).hex().upper()
        if hexstr == "":
            raise Exception(f"Offset {offset} not found in file {libfile}")
        return sep.join(getbytes(hexstr))


@cache
def armtohex(armcode: str, architecture: str, sep: str = " ", upper: bool = True) -> str:
    ks = make_ks(architecture)
    # Convert string of code to list of instructions (split by newline)
    lines = armcode.split("\n")
    convertedhexlist = []
    for instruction in lines:
        # Skip blank lines ("".isspace() is False, so test the stripped string instead)
        if not instruction.strip():
            continue
        try:
            convertedinstruction = ks.asm(instruction, as_bytes=True)[0]
            convertedhexlist.append(binascii.hexlify(convertedinstruction).decode())
        except Exception:
            raise Exception(f"Failed to assemble ARM opcode: {instruction} with {architecture} "
                              f"architecture. Is the ARM instruction valid? Is the architecture correct?") from None
    convertedhex = sep.join(convertedhexlist)
    if upper:
        convertedhex = convertedhex.upper()
    return convertedhex


@cache
def hextoarm(hexstr: str, architecture: str) -> list[str]:
    if hexstr == "" or hexstr.isspace():
        return []
    cs = make_cs(architecture)
    convertedinstructions = []
    for insn in cs.disasm(bytearray.fromhex(remove_whitespace(hexstr)), 0x0):
        op = f"{insn.mnemonic} {insn.op_str}".strip()
        convertedinstructions.append(op)
    if not convertedinstructions:
        raise Exception(f"Failed to disassemble hex: {hexstr} with {architecture} architecture."
                        f" Check that the hex instruction comes from the right lib file at the "
                        f"right offset, and the architecture is correct.") from None

    return convertedinstructions

def is_relative_instruction(instruction: str, architecture: str) -> bool:
    """
    Uses capstone and manual heuristics to check if an asm instruction is dynamic.
    Should work for any architecture!
    """
    cs = make_cs(architecture)
    # This is annoying. We need to assemble the instruction to hex, then disassemble it again to get capstone info.
    cs_insns = tuple(cs.disasm(bytearray.fromhex(remove_whitespace(armtohex(instruction, architecture))), 0x0))
    if len(cs_insns) != 1:
        raise Exception(f"Instruction {instruction} is not one instruction (it is {len(cs_insns)}) with architecture"
                        f" {architecture}")
    cs_insn = cs_insns[0]
    # noinspection IncorrectFormatting
    return ("0x" in instruction or "#" in instruction) or (cs_insn.group(capstone.CS_GRP_CALL) or
            cs_insn.group(capstone.CS_GRP_JUMP) or cs_insn.group(capstone.CS_GRP_BRANCH_RELATIVE))

def generate_aob(hexinstructions: str, architecture: str) -> str:
    # Disassemble the raw bytes, then re-assemble each instruction to get its hex and byte count
    wildcard_byte = "??"
    hexlist = []
    for instruction in hextoarm(hexinstructions, architecture):
        instruction_hex = armtohex(instruction, architecture)
        if instruction_hex == "":
            continue
        if is_relative_instruction(instruction, architecture):
            hexlist.append(" ".join([wildcard_byte] * bytecount(instruction_hex)))
        else:
            hexlist.append(instruction_hex)
    # We want our separator in between every byte, so we do this little maneuver.
    aob = "".join(hexlist)  # Unformatted
    return " ".join(getbytes(aob))


hexstring = offset_to_hex(OFFSET, LIB_PATH, hexbytes=600)  # hexbytes = amount of bytes for AOB
print(generate_aob(hexstring, ARCHITECTURE))

x86 turned out to be a huge pain 😅 because it has variable-length instructions, and it is harder to detect which ones are dynamic. But this should work - let me know if it suits you! If you need the dependencies, you can install them from the tool's requirements.txt.
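To show why the variable length matters, here is a small standard-library sketch of the wildcarding step by itself. It assumes the instructions have already been decoded into (raw bytes, is_dynamic) pairs; `build_aob` and the sample list are hypothetical and not part of the script above. In a real run the raw bytes could come straight from the disassembler (capstone exposes them per instruction, as far as I know via `insn.bytes`), which would avoid re-assembling each instruction just to learn its length.

```python
def build_aob(instructions: list) -> str:
    """Build an AOB from (raw_bytes, is_dynamic) pairs of any instruction length."""
    tokens = []
    for raw, is_dynamic in instructions:
        if is_dynamic:
            # Wildcard the whole instruction, one '??' per byte it occupies
            tokens.extend(["??"] * len(raw))
        else:
            tokens.extend(f"{b:02X}" for b in raw)
    return " ".join(tokens)


# Hand-written x86 sample; the dynamic flags are set manually for illustration
decoded = [
    (bytes.fromhex("55"), False),         # push ebp        (1 byte)
    (bytes.fromhex("89E5"), False),       # mov ebp, esp    (2 bytes)
    (bytes.fromhex("E810000000"), True),  # call rel32      (5 bytes)
    (bytes.fromhex("5D"), False),         # pop ebp         (1 byte)
    (bytes.fromhex("C3"), False),         # ret             (1 byte)
]

aob = build_aob(decoded)
print(aob)
```

Unlike fixed-width ARM, where every wildcarded instruction is four `??` bytes, the number of wildcards here has to follow each instruction's actual size.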
