Tiny ELF binaries on ARM

Looks like breadbox didn't want to go down this rabbit hole, so we'll have to do it instead.

  • e_machine seems to be the only checked header field. (The e_entry alignment checks are to be expected: if the entry point weren't aligned, the code would segfault on entry anyway.)
  • phdr parsing etc. is architecture-independent, so the same tricks should be usable here as well.
    • However, on x86 we relied on the way page mapping works so that only a few flags had to be specified; this probably can't be ported over.
  • Each ARM opcode is 4 bytes long, and needs to be aligned. This kinda sucks for all the overlapping tricks. Also, arbitrary constants can't be loaded into registers easily.
    • Do we want to depend on Thumb-mode?
  • Apparently the kernel doesn't look at the immediate field of swi instructions if it's configured as EABI-only.
  • Dynamic linking stuff
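
If the swi immediate really is ignored on EABI-only kernels, those 24 bits are free space for data. A rough Python sketch of the encoding, assuming the classic ARM-mode layout (condition in the top 4 bits, 0xF below it, 24-bit immediate); the helper name encode_swi is made up for illustration:

```python
import struct

def encode_swi(imm24, cond=0xE):
    """Encode ARM-mode `swi #imm24` as little-endian bytes.

    encode_swi is a hypothetical helper; cond=0xE means "always".
    """
    if not 0 <= imm24 < (1 << 24):
        raise ValueError("swi immediate must fit in 24 bits")
    word = (cond << 28) | (0xF << 24) | imm24
    return struct.pack("<I", word)

# The three low bytes are whatever we put into the immediate.
print(encode_swi(31337).hex())
print(encode_swi(0).hex())
```

So every swi in ARM mode carries three bytes that could hold a string fragment or a small constant, as long as the target kernel never looks at them.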

A few questions on the target platform

  • How many ARM Linux machines have thumb, half and fastmult? EABI or OABI?
    • Seems to be common enough. OABI is dead, everyone's on EABI now.
  • Which CPU are we targeting? ARMv6T (RPi 1)? ARMv5TE?

Minimal ELF PoC

Not that minimal :) (But it should be able to show you which fields can be bogus quite clearly.)

gcc -c -o tiny.o tiny.S
ld -nostdlib -nostartfiles -T tiny.ld -o tiny.elf tiny.o
objcopy -O binary tiny.elf tiny.bin # somehow ld --oformat=binary no worky?

(Of course, the toolchain should be arm-linux-gnueabi or similar.)

@ tiny.S
.arch armv5te ; @.cpu arm946e-s

.section .ehdr,"awx",%progbits
.align 4
.arm

#define ORG 0x08048000

ehdr:
        e_ident:
                .byte 0x7F; .ascii "ELF"
                .byte 'p' @ ELFCLASS32
                .byte 'c' @ ELFDATA2LSB
                .byte 'y' @ EV_CURRENT
                .byte '/' @ EI_OSABI_SYSV
                          @ EI_ABIVERSION
                .ascii "K2^TiTAN" @.byte 0,0,0,0,0,0,0 @ EI_PAD
        e_type: .2byte 2 @ ET_EXEC
        e_machine: .2byte 40 @ EM_ARM
        e_version: .ascii "gree"
        e_entry: .4byte _start
        e_phoff: .4byte phdr - ehdr
        e_shoff: .ascii "tsto"
        e_flags:                                  
                @.4byte 0x2|0x4|0x40|0x80|0x00400000|0x05000000
                        @ 2: hasentry
                        @ 4: thumb interwork
                        @ 40: 8-bit struct alignment
                        @ 80: EABI
                        @ 00400000: little-endian AAPCS
                        @ 05000000: EABI v5
                .ascii "brea" @ 0x05000000 @ EABIv5, no float stuff
        e_ehsize: .2byte e__end - ehdr
        e_phentsize: .2byte p__end - phdr
        e_phnum: .2byte 1
        e_shentsize: .ascii "dbox&&"
        @e_shnum:
        @e_shstrndx:
        e__end:
phdr:
        p_type: .4byte 1 @ PT_LOAD
        p_offset: .4byte 0
        p_vaddr: .4byte ORG
        p_paddr: .ascii "all@"
        p_filesz: .4byte _start__end - _start + e__end - ehdr + p__end - phdr
        p_memsz: .4byte _start__end - _start + e__end - ehdr + p__end - phdr
        p_flags: .byte 5 ; .ascii "IRC" @ R=4 W=2 X=1
        p_align: .ascii "#lsc" @ 0x1000
        p__end:

.global _start
_start:
        mov r7, #1 @ SYS_exit
        mov r0, #42
        swi #31337 @ literal can be nonsense
        @bkpt #1337 @ like x86 int3 @ literal can be nonsense
_start__end:

/* tiny.ld */
OUTPUT_FORMAT("elf32-littlearm","elf32-bigarm","elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)

SECTIONS {
        . = 0x08048000;

        .ehdr : {
                *(.ehdr*)
        }
}

00000000  7f 45 4c 46 70 63 79 2f  4b 32 5e 54 69 54 41 4e  |.ELFpcy/K2^TiTAN|
00000010  02 00 28 00 67 72 65 65  54 80 04 08 34 00 00 00  |..(.greeT...4...|
00000020  74 73 74 6f 62 72 65 61  34 00 20 00 01 00 64 62  |tstobrea4. ...db|
00000030  6f 78 26 26 01 00 00 00  00 00 00 00 00 80 04 08  |ox&&............|
00000040  61 6c 6c 40 60 00 00 00  60 00 00 00 05 49 52 43  |all@`...`....IRC|
00000050  23 6c 73 63 01 70 a0 e3  2a 00 a0 e3 69 7a 00 ef  |#lsc.p..*...iz..|
00000060
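
To see which fields still carry real values, the dump can be run through a quick parser. This is just a sketch: the struct layouts are the standard ELF32 ehdr/phdr, the bytes are copied from the hexdump above, and the names TINY and parse_tiny are made up for illustration.

```python
import struct

# The 96 bytes from the hexdump above.
TINY = bytes.fromhex(
    "7f454c467063792f4b325e546954414e"
    "02002800677265655480040834000000"
    "7473746f627265613400200001006462"
    "6f782626010000000000000000800408"
    "616c6c40600000006000000005495243"
    "236c73630170a0e32a00a0e3697a00ef"
)

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 header, 52 bytes
PHDR = struct.Struct("<8I")                # ELF32 program header, 32 bytes

def parse_tiny(image):
    """Hypothetical helper: pull out the fields the PoC keeps meaningful."""
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = EHDR.unpack_from(image, 0)
    (p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz,
     p_flags, p_align) = PHDR.unpack_from(image, e_phoff)
    return {
        "e_type": e_type, "e_machine": e_machine, "e_entry": e_entry,
        "e_phoff": e_phoff, "e_ehsize": e_ehsize,
        "e_phentsize": e_phentsize, "e_phnum": e_phnum,
        "p_type": p_type, "p_offset": p_offset, "p_vaddr": p_vaddr,
        "p_filesz": p_filesz, "p_memsz": p_memsz, "p_flags": p_flags,
        # File offset of the first instruction.
        "entry_offset": e_entry - p_vaddr + p_offset,
    }

for name, value in parse_tiny(TINY).items():
    print(f"{name:12s} {value:#x}")
```

Everything not printed here (e_version, e_shoff, e_flags, the section header fields, p_paddr, p_align, and most of e_ident) is text in the PoC. For p_flags only the low byte is a real value, the upper three bytes are text.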

ARM sizecoding notes

  • All ARM instructions are 4 bytes wide
  • There's a "Thumb" mode with a reduced instruction set (e.g. no free shifts, mostly 2-operand instructions) where almost all instructions are 2 bytes wide (bl is the exception)
  • Switch to Thumb like this: add lr, pc, #1; bx lr
  • The mov opcode doesn't accept arbitrary immediate values (only an 8-bit value rotated right by an even number of bits), so you sometimes have to spill values to a "constant pool"
  • Null bytes decode to andeq r0, r0, r0 in ARM mode, or to movs r0, r0 in Thumb mode. Both are effectively no-ops (the Thumb one only updates the flags), unlike on x86, where null bytes decode to add [eax], al and usually segfault.
    • Instruction encoding is relatively sane, so you can predict what low-value 32-bit ints will decode to and treat them as (almost-)no-ops.
  • It's a RISC, so you don't have one-byte instructions, flexible addressing modes or string ops, but there are a few useful parts in ARM:
    • pc, sp etc. behave as regular registers
    • You can shift one operand of an instruction by a constant value for free; it doesn't cost any bytes (ARM mode only). This can also be used to do fixed-point multiplications etc. (e.g. add r0, r0, r0, lsr #1 for r0*1.5)
    • ldmia/stmia are great for copying stuff around.
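
To check up front whether a constant fits into a single mov (or any other data-processing immediate), something like the following Python sketch should do. It assumes the classic ARM-mode encoding, an 8-bit value rotated right by an even amount; the function name arm_imm_encoding is made up for illustration.

```python
def arm_imm_encoding(value):
    """Return (rot, imm8) if value fits an ARM-mode immediate, else None.

    Hypothetical helper. The CPU computes imm8 rotated right by 2*rot bits,
    with rot in 0..15.
    """
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Rotating left by 2*rot undoes the CPU's rotate-right.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 < 0x100:
            return rot, imm8
    return None

for const in (42, 0xFF000000, 0x104, 0x102, 31337, 0x08048000):
    print(f"{const:#010x}", arm_imm_encoding(const))
```

42 and 0xFF000000 are fine, and so is 0x104 (0x41 rotated right by 30). 0x102 is only 8 bits wide but sits at an odd bit offset, so it doesn't fit, and neither do 31337 or the 0x08048000 load address; those need a constant pool entry or have to be built up in several instructions.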