Changes in 4ac3033: Updated tinyelf-arm (markdown)

				explain/tinyelf-arm.md
			
          @@ -2,21 +2,25 @@

           Looks like breadbox didn't want to go down *this* rabbithole. So we'll have to do it instead.

          +* Target platform: OABI is dead, so EABI it is. Anything running Linux seems to support `thumb`, `half` and `fastmult`, and usually `edsp` as well.

          +  * So what will the minimum target CPU be? ARMv6T (RPI1)? ARMv5TE?

          +* All ARM instructions are 4 bytes wide

          +* There's a "Thumb" mode with a reduced instruction set (e.g. no free shifts, hardly any 3-operand instructions), where nearly all instructions are 2 bytes wide (`bl` is the main exception in classic Thumb)

          +* Switch to Thumb like this: `add lr, pc, #1; bx lr`

          +  * Writing to `pc` with a plain ALU instruction (e.g. `add pc, pc, #1`) doesn't switch mode on ARMv6 and earlier; as far as the architecture manuals go, that only interworks from ARMv7 on, so `bx` is the safe choice.

          +* The `mov` opcode doesn't accept arbitrary immediate values (in ARM mode, only an 8-bit value rotated right by an even amount), so you sometimes have to spill values to a "constant pool"

          +* Null bytes decode to `andeq r0, r0` in ARM mode, or to `movs r0, r0` in Thumb mode. Both are effectively no-ops (the Thumb one only updates the flags), unlike on x86, where null bytes decode to a memory write through `eax`/`rax` that usually segfaults.

          +  * Instruction encoding is relatively sane, so you can predict what low-value 32-bit ints will decode to and treat them as (almost-)no-ops.

          +* It's a RISC, so you don't have one-byte-instructions, flexible addressing modes or stringops, but there are a few useful parts in ARM:

          +  * `pc`, `sp` etc. behave as regular registers

          +  * You can shift one operand of an instruction by a constant value for free: it doesn't cost any extra bytes (ARM mode only). This can also be used for fixed-point multiplications etc. (e.g. `add r0, r0, lsr #1` for `r0*1.5`)

          +  * `ldmia`/`stmia` are great for copying stuff around

           * [`e_machine` seems to be the only checked header field](https://code.woboq.org/linux/linux/arch/arm/kernel/elf.c.html) (`e_entry` alignment checks are to be expected, since code at an unaligned entry point would fault on entry anyway.)

           * `phdr` parsing etc. is architecture-independent, so the same tricks should be usable here as well.

          -  * However, on x86, we were using the way page mapping works to only have to specify a few flags, this probably can't be ported over.

          -* Each ARM opcode is 4 bytes long, and needs to be aligned. This kinda sucks for all the overlapping tricks. Also, arbitrary constants can't be loaded into registers easily.

          -  * Do we want to depend on Thumb-mode?

          -* Apparently the kernel doesn't look at the immediate field of `swi` instructions __if it's configured as EABI-only__.

          +  * Turns out it's even more relaxed than x86 when messing with `p_paddr`, `p_padding` and `p_flags`: the kernel and CPU seem to happily let you execute code in read-write pages.

          +* Apparently the kernel doesn't look at the immediate field of `swi` instructions __if it's configured as EABI-only__ (which we assume).

           * [Dynamic linking stuff](https://linux.weeaboo.software/explain/rtld#dynamic-linking_arm)
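
The `mov` immediate restriction in the list above can be checked mechanically. The following is a minimal Python sketch, assuming the classic ARM-mode data-processing encoding (an 8-bit value rotated right by an even amount). The helper names `rol32` and `arm_immediate` are made up for illustration and are not part of any toolchain.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def arm_immediate(value):
    """Return (imm8, rot4) if `value` fits an ARM-mode immediate, else None.

    The encoded constant is imm8 rotated right by 2*rot4, so undoing that
    rotation (rotating left) must leave a value that fits in 8 bits.
    """
    for rot4 in range(16):
        imm8 = rol32(value, 2 * rot4)
        if imm8 < 0x100:
            return imm8, rot4
    return None


for candidate in (42, 0xFF000000, 0x3FC, 0x101, 0x7A69):
    print(hex(candidate), arm_immediate(candidate))
```

Values for which this returns `None` are the ones that need another trick, such as a constant pool entry.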

          -### A few questions on the target platform

          -

          -* How many ARM Linux machines have `thumb`, `half` and `fastmult`? EABI or OABI?

          -  * Seems to be common enough. OABI is dead, everyone's on EABI now.

          -* Which CPU are we targeting? ARMv6T (RPI1)? ARMv5TE?

          -

          -

           ### Minimal ELF PoC

           Not *that* minimal :) (But it should be able to show you which fields can be bogus quite clearly.)

          @@ -114,16 +118,3 @@ SECTIONS {

           00000050  23 6c 73 63 01 70 a0 e3  2a 00 a0 e3 69 7a 00 ef  |#lsc.p..*...iz..|

           00000060

           ```
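
As a cross-check of the `swi` note, the last three words of the dump above can be decoded by hand. Below is a small Python sketch that handles only the two encodings that occur there (`mov` with an immediate, and `swi`). The `describe` helper is hypothetical and is not a general disassembler.

```python
import struct

# Last 12 bytes of the hex dump above (file offsets 0x54..0x5f).
TAIL = bytes.fromhex("0170a0e3 2a00a0e3 697a00ef")


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def describe(word):
    """Render a little-endian ARM word, or fall back to a raw .word."""
    if word >> 28 != 0xE:                      # only the "always" condition
        return f".word 0x{word:08x}"
    if (word >> 24) & 0xF == 0xF:              # swi: low 24 bits are the immediate
        return f"swi 0x{word & 0xFFFFFF:06x}"
    if (word >> 21) & 0x7F == 0b0011101 and not (word >> 20) & 1:
        rd = (word >> 12) & 0xF                # mov Rd, #imm (S bit clear)
        imm = ror32(word & 0xFF, 2 * ((word >> 8) & 0xF))
        return f"mov r{rd}, #{imm}"
    return f".word 0x{word:08x}"


decoded = [describe(w) for w in struct.unpack("<3I", TAIL)]
for line in decoded:
    print(line)
```

On EABI the syscall number travels in `r7`, so this is `exit(42)` (`__NR_exit` is 1). The `0x007a69` in the `swi` word is free-form payload, as long as the kernel is EABI-only.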

          -

          -## ARM sizecoding notes

          -

          -* All ARM instructions are 4 bytes wide

          -* There's a "Thumb" mode, with reduced instructions (eg. no free shifts, no 3-operand instructions) where all instructions are 2 bytes wide

          -* Switch to Thumb like this: `add lr, pc, #1; bx lr`

          -* The `mov` opcode doesn't accept arbitrary immediate values, so you sometimes have to spill values to a "constant pool"

          -* Null bytes decode to `andeq r0, r0` in ARM mode, or to `movs r0, r0` in Thumb mode, both are no-ops, unlike in x86 where null bytes cause a segfault.

          -  * Instruction encoding is relatively sane, so you can predict what low-value 32-bit ints will decode to so you can treat them as (almost-)no-ops.

          -* It's a RISC, so you don't have one-byte-instructions, flexible addressing modes or stringops, but there are a few useful parts in ARM:

          -  * `pc`, `sp` etc. behave as regular registers

          -  * You can shift one operand of an instruction by a constant value for free, it doesn't cost any bytes. (ARM mode only) This can also be used to do fixed-point multiplications etc. (eg. `add r0, r0, lsr #1` for `r0*1.5`)

          -  * `ldmia`/`stmia` are great for copying stuff around.

          \ No newline at end of file
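
To see why the `add lr, pc, #1; bx lr` sequence from the notes works: in ARM state, reading `pc` yields the address of the current instruction plus 8, so `lr` ends up pointing just past the `bx`, with bit 0 set to request Thumb state. Here is a small Python model of that arithmetic. The helper name is hypothetical, and the example address is chosen only for illustration.

```python
def thumb_switch_target(add_address):
    """Model `add lr, pc, #1; bx lr` executed in ARM state.

    `add_address` is the (word-aligned) address of the `add` instruction.
    Returns the branch target and whether the CPU ends up in Thumb state.
    """
    pc_as_read = add_address + 8   # ARM state: pc reads as current instruction + 8
    lr = pc_as_read + 1            # bit 0 set: ask bx for Thumb state
    target = lr & ~1               # bx strips bit 0 to form the branch address
    to_thumb = bool(lr & 1)
    return target, to_thumb


add_address = 0x10054              # illustrative address, not taken from the PoC
target, to_thumb = thumb_switch_target(add_address)
print(hex(target), to_thumb)
```

The target is the address right after the `bx` (which sits at `add_address + 4`), so execution simply continues with the following bytes, now decoded as Thumb.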
