Tiny ELF binaries on ARM
Looks like breadbox didn't want to go down this rabbit hole, so we'll have to do it instead.
- Target platform: OABI is dead, so EABI it is. Anything running Linux seems to support `thumb`, `half` and `fastmult`, and usually `edsp` as well.
  - So what will the minimum target CPU be? ARMv6T (RPI1)? ARMv5TE?
- All ARM instructions are 4 bytes wide
- There's a "Thumb" mode, with reduced instructions (eg. no free shifts, no 3-operand instructions) where all instructions are 2 bytes wide
- Switch to Thumb like this: `add lr, pc, #1; bx lr`
  - On older ARMs, it was possible to directly write to `pc` and switch mode, but this doesn't seem to be possible on ARMv6 anymore.
- The `mov` opcode doesn't accept arbitrary immediate values, so you sometimes have to spill values to a "constant pool", or be creative with assignments and register write-backs in load/store ops
- Null bytes decode to `andeq r0, r0` in ARM mode, or to `movs r0, r0` in Thumb mode. Both are no-ops, unlike on x86, where null bytes cause a segfault.
  - Instruction encoding is relatively sane, so you can predict what low-value 32-bit ints will decode to and treat them as (almost-)no-ops.
- It's a RISC, so you don't have one-byte instructions, flexible addressing modes or string ops, but there are a few useful parts in ARM:
  - `pc`, `sp` etc. behave as regular registers
  - You can shift one operand of an instruction by a constant value for free, it doesn't cost any bytes. (ARM mode only) This can also be used to do fixed-point multiplications etc. (eg. `add r0, r0, lsr #1` for `r0*1.5`)
  - `ldmia`/`stmia` are great for copying stuff around
- `e_machine` (and `e_type`) seem to be the only checked header fields (`e_entry` alignment checks are normal, because if it weren't aligned, the code would segfault on entry.)
  - Of course, `e_entry`, `e_phoff`, `e_phnum` need to contain the right values, and `e_phentsize` and `e_ehsize` need to be correct as well.
- `phdr` parsing etc. is done architecture-independently, so the same tricks should be usable here as well.
  - Turns out it's even more relaxed than x86 when messing with `p_paddr`, `p_padding` and `p_flags`. It seems to be the case that the kernel & CPU will happily let you execute code in read-write pages.
- Apparently the kernel doesn't look at the immediate field of `swi` and `bkpt` instructions if it's configured as EABI-only (which we assume).
- Dynamic linking stuff
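The `mov` restriction from the list above comes from how ARM-mode data-processing instructions encode constants: an 8-bit value rotated right by an even number of bits. Here's a minimal Python sketch for checking whether a constant fits. The function name `arm_imm_encoding` is made up for illustration, and it covers ARM mode only (Thumb immediates follow different rules).

```python
def arm_imm_encoding(value):
    """Return (rot, imm8) if `value` fits an ARM-mode immediate, else None.

    The CPU computes the constant as imm8 rotated right by 2*rot bits,
    so we undo that by rotating left and checking for an 8-bit result.
    """
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        candidate = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if candidate <= 0xFF:
            return rot, candidate
    return None


if __name__ == "__main__":
    for const in (42, 0xFF000000, 0x3FC, 31337, 0x08048000):
        print(f"{const:#010x}", arm_imm_encoding(const))
```

So `42` and `0xFF000000` fit in a single `mov`, while `31337` and the load address `0x08048000` come back as `None` and need the constant pool or some other trick.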
Minimal ELF PoC
Not that minimal :) (But it should be able to show you which fields can be bogus quite clearly.)
gcc -c -o tiny.o tiny.S
ld -nostdlib -nostartfiles -T tiny.ld -o tiny.elf tiny.o
objcopy -O binary tiny.elf tiny.bin # somehow ld --oformat=binary no worky?
(Of course, the toolchain should be `arm-linux-gnueabi` or something similar.)
@ tiny.S
.arch armv5te ; @.cpu arm946e-s
.section .ehdr,"awx",%progbits
.align 4
.arm
#define ORG 0x08048000
ehdr:
e_ident:
.byte 0x7F; .ascii "ELF"
.byte 'p' @ ELFCLASS32
.byte 'c' @ ELFDATA2LSB
.byte 'y' @ EV_CURRENT
.byte '/' @ EI_OSABI_SYSV
@ EI_ABIVERSION
.ascii "K2^TiTAN" @.byte 0,0,0,0,0,0,0 @ EI_PAD
e_type: .2byte 2 @ ET_EXEC
e_machine: .2byte 40 @ EM_ARM
e_version: .ascii "gree"
e_entry: .4byte _start
e_phoff: .4byte phdr - ehdr
e_shoff: .ascii "tsto"
e_flags:
@.4byte 0x2|0x4|0x40|0x80|0x00400000|0x05000000
@ 2: hasentry
@ 4: thumb interwork
@ 40: 8-bit struct alignment
@ 80: EABI
@ 00400000: little-endian AAPCS
@ 05000000: EABI v5
.ascii "brea" @ 0x05000000 @ EABIv5, no float stuff
e_ehsize: .2byte e__end - ehdr
e_phentsize: .2byte p__end - phdr
e_phnum: .2byte 1
e_shentsize: .ascii "dbox&&"
@e_shnum:
@e_shstrndx:
e__end:
phdr:
p_type: .4byte 1 @ PT_LOAD
p_offset: .4byte 0
p_vaddr: .4byte ORG
p_paddr: .ascii "all@"
p_filesz: .4byte _start__end - _start + e__end - ehdr + p__end - phdr
p_memsz: .4byte _start__end - _start + e__end - ehdr + p__end - phdr
p_flags: .byte 5 ; .ascii "IRC" @ R=4 W=2 X=1
p_align: .ascii "#lsc" @ 0x1000
p__end:
.global _start
_start:
mov r7, #1 @ SYS_exit
mov r0, #42
swi #31337 @ literal can be nonsense
@bkpt #1337 @ like x86 int3 @ literal can be nonsense
_start__end:
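To check that the nonsense `swi` literal really lands in the instruction word, here's a rough Python sketch that picks apart the three words `_start` assembles to (the 12 bytes at offset 0x54 in the hexdump of tiny.bin). The `decode` helper is hypothetical and only understands the two encodings used here, `mov` with an immediate and `swi`. It is nowhere near a real disassembler.

```python
import struct

# The 12 code bytes at the end of tiny.bin (offset 0x54 in the hexdump).
CODE = bytes.fromhex("01 70 a0 e3 2a 00 a0 e3 69 7a 00 ef")


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode(word):
    """Decode only the two ARM encodings used by tiny.S (illustrative helper)."""
    if (word >> 24) & 0xF == 0xF:
        # swi: the low 24 bits are the immediate ("comment") field.
        return ("swi", word & 0xFFFFFF)
    if (word >> 21) & 0x7F == 0b0011101:
        # mov Rd, #imm: imm8 rotated right by twice the 4-bit rotate field.
        rd = (word >> 12) & 0xF
        imm = ror32(word & 0xFF, 2 * ((word >> 8) & 0xF))
        return ("mov", rd, imm)
    raise ValueError(f"not handled by this sketch: {word:#010x}")


words = struct.unpack("<3I", CODE)
decoded = [decode(w) for w in words]

if __name__ == "__main__":
    for w, d in zip(words, decoded):
        print(f"{w:08x}", d)
```

The last word, `ef007a69`, decodes to `("swi", 31337)`: the literal sits in the low 24 bits and the syscall number still travels in `r7`.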
/* tiny.ld */
OUTPUT_FORMAT("elf32-littlearm","elf32-bigarm","elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SECTIONS {
. = 0x08048000;
.ehdr : {
*(.ehdr*)
}
}
00000000 7f 45 4c 46 70 63 79 2f 4b 32 5e 54 69 54 41 4e |.ELFpcy/K2^TiTAN|
00000010 02 00 28 00 67 72 65 65 54 80 04 08 34 00 00 00 |..(.greeT...4...|
00000020 74 73 74 6f 62 72 65 61 34 00 20 00 01 00 64 62 |tstobrea4. ...db|
00000030 6f 78 26 26 01 00 00 00 00 00 00 00 00 80 04 08 |ox&&............|
00000040 61 6c 6c 40 60 00 00 00 60 00 00 00 05 49 52 43 |all@`...`....IRC|
00000050 23 6c 73 63 01 70 a0 e3 2a 00 a0 e3 69 7a 00 ef |#lsc.p..*...iz..|
00000060
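To make it easier to see which bytes in the dump above actually matter, here's a small Python sketch that unpacks the ELF header and the single program header with `struct`. The field names follow the labels in tiny.S; `parse` and the `MEANINGFUL` list are made up for this example and just restate the fields the notes above say need sane values.

```python
import struct

# tiny.bin, copied from the hexdump above.
TINY_BIN = bytes.fromhex(
    "7f 45 4c 46 70 63 79 2f 4b 32 5e 54 69 54 41 4e"
    "02 00 28 00 67 72 65 65 54 80 04 08 34 00 00 00"
    "74 73 74 6f 62 72 65 61 34 00 20 00 01 00 64 62"
    "6f 78 26 26 01 00 00 00 00 00 00 00 00 80 04 08"
    "61 6c 6c 40 60 00 00 00 60 00 00 00 05 49 52 43"
    "23 6c 73 63 01 70 a0 e3 2a 00 a0 e3 69 7a 00 ef"
)

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")   # 52 bytes, little-endian ELF32
PHDR = struct.Struct("<IIIIIIII")           # 32 bytes
EHDR_FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
               "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
               "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")
PHDR_FIELDS = ("p_type", "p_offset", "p_vaddr", "p_paddr",
               "p_filesz", "p_memsz", "p_flags", "p_align")

# Fields that hold real values in tiny.S; everything else is ASCII filler.
MEANINGFUL = ("e_type", "e_machine", "e_entry", "e_phoff", "e_ehsize",
              "e_phentsize", "e_phnum", "p_type", "p_offset", "p_vaddr",
              "p_filesz", "p_memsz")


def parse(blob):
    """Return (ehdr, phdr) dicts for the first program header of `blob`."""
    ehdr = dict(zip(EHDR_FIELDS, EHDR.unpack_from(blob, 0)))
    phdr = dict(zip(PHDR_FIELDS, PHDR.unpack_from(blob, ehdr["e_phoff"])))
    return ehdr, phdr


ehdr, phdr = parse(TINY_BIN)
fields = {**ehdr, **phdr}

if __name__ == "__main__":
    for name in MEANINGFUL:
        print(f"{name:12} {fields[name]:#x}")
    # Only the low byte of p_flags is real (.byte 5 = R|X).
    print(f"{'p_flags & 7':12} {phdr['p_flags'] & 7:#x}")
    print(f"{'entry offset':12} {ehdr['e_entry'] - phdr['p_vaddr']:#x}")
```

The entry point works out to file offset 0x54, right after the program header, and `p_filesz` covers the whole 96-byte file.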