Channel: Raspberry Pi Forums

Bare metal, Assembly language • Re: SIMD LDR from device memory

I've been writing optimized bare metal code for the Raspberry Pi 4 for years without issues. The source code in my case is in Rust, but that shouldn't make a difference. I don't have a Raspberry Pi 3, and since I'm currently writing bare metal code for the 5, that's the one I tested this on. I found your conclusion weird, so to make sure that this is not an architecture oddity I wrote some assembly code that does a 32-bit read, with an offset, into a SIMD register from a hardware register that is not 128-bit aligned. The register sits in identity-mapped virtual memory that is marked as device memory. I didn't experience anything unexpected.

Below is the code that I wrote:

Code:

    unsafe {
        asm!(
            "b .", // Spin waiting for the debugger.
            "ldr {reg}, =0x107D001000", // Always-on UART's base address.
            "ldr {vreg:s}, [{reg}, #0x18]", // Load value from the flags register.
            "ldr {reg:w}, [{reg}, #0x18]", // Load value from the flags register.
            reg = out (reg) _,
            vreg = out (vreg) _,
            options (nomem, nostack, preserves_flags)
        );
    }
And below is a debug session where I tested it:

Code:

(lldb) process connect connect://0:3333
Process 1 stopped
* thread #1, stop reason = signal SIGINT
    frame #0: 0x000000000008208c
->  0x8208c: b      0x8208c
    0x82090: ldr    x9, #0x168
    0x82094: ldr    s0, [x9, #0x18]
    0x82098: ldr    w9, [x9, #0x18]
Target 0: (No executable module.) stopped.
(lldb) register write pc 0x82090
(lldb) print/x *(int *) 0x107D001018
(int) 0x00000090
(lldb) next
Process 1 stopped
* thread #1, stop reason = instruction step over
    frame #0: 0x0000000000082094
->  0x82094: ldr    s0, [x9, #0x18]
    0x82098: ldr    w9, [x9, #0x18]
    0x8209c: adrp   x9, 11
    0x820a0: ldrb   w10, [x9, #0x8]
Target 0: (No executable module.) stopped.
(lldb) register read/x s0
      s0 = 0x00000000
(lldb) next
Process 1 stopped
* thread #1, stop reason = instruction step over
    frame #0: 0x0000000000082098
->  0x82098: ldr    w9, [x9, #0x18]
    0x8209c: adrp   x9, 11
    0x820a0: ldrb   w10, [x9, #0x8]
    0x820a4: tbnz   w10, #0x0, 0x820c4
Target 0: (No executable module.) stopped.
(lldb) register read/x s0
      s0 = 0x00000090
(lldb) register read w9
      w9 = 0x7d001000
(lldb) next
Process 1 stopped
* thread #1, stop reason = instruction step over
    frame #0: 0x000000000008209c
->  0x8209c: adrp   x9, 11
    0x820a0: ldrb   w10, [x9, #0x8]
    0x820a4: tbnz   w10, #0x0, 0x820c4
    0x820a8: mov    w10, #0xa
Target 0: (No executable module.) stopped.
(lldb) register read w9
      w9 = 0x00000090
(lldb) process detach
Process 1 detached
So, as you can see, at least on the Cortex-A76, a 32-bit read with an offset into a SIMD register, from a hardware register that is not 128-bit aligned, works as expected: after the load, s0 holds 0x00000090, the same value that lldb read directly from 0x107D001018 and that the plain load then put in w9. Also, while I didn't show it, as that would require showing the translation table as well, the MAIR byte corresponding to the memory mapped at that address is 0x00, meaning Device-nGnRnE memory.
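In case it helps to double check the numbers, here is a rough host-side sketch of the arithmetic behind the test. It doesn't touch any hardware, and the helper names (`is_aligned`, `mair_attr`, `device_type`) are made up for illustration rather than taken from any crate. The MAIR value used in `main` is also just an example; the only thing I'm claiming about my own setup is that the attribute byte for this mapping is 0x00.

```rust
/// Base address of the always-on UART used in the assembly above.
const UART_BASE: u64 = 0x10_7D00_1000;
/// Offset of the flags register from the UART base.
const FLAGS_OFFSET: u64 = 0x18;

/// Hypothetical helper: true if `addr` is a multiple of `bytes`.
fn is_aligned(addr: u64, bytes: u64) -> bool {
    addr % bytes == 0
}

/// Hypothetical helper: extracts attribute byte `index` (0..=7) from a MAIR_ELx value.
fn mair_attr(mair: u64, index: u32) -> u8 {
    ((mair >> (8 * index)) & 0xff) as u8
}

/// Hypothetical helper: names the Device memory type for an attribute byte.
/// Only the four basic Device encodings are handled; anything else yields None.
fn device_type(attr: u8) -> Option<&'static str> {
    match attr {
        0x00 => Some("Device-nGnRnE"),
        0x04 => Some("Device-nGnRE"),
        0x08 => Some("Device-nGRE"),
        0x0c => Some("Device-GRE"),
        _ => None,
    }
}

fn main() {
    let flags = UART_BASE + FLAGS_OFFSET;
    println!("flags register address: {:#x}", flags);
    println!("32-bit aligned:  {}", is_aligned(flags, 4));
    println!("128-bit aligned: {}", is_aligned(flags, 16));
    // w9 is the low half of x9, which is why lldb shows 0x7d001000 before the load.
    println!("low 32 bits of base: {:#x}", UART_BASE as u32);
    // Example MAIR value (an assumption, not my real one) with 0x00 in slot 0.
    let example_mair: u64 = 0xff44_0400;
    println!("attr 0: {:?}", device_type(mair_attr(example_mair, 0)));
}
```

The point is that 0x107D001018 is naturally aligned for a 32-bit access but sits 8 bytes past a 16-byte boundary, so the load into s0 is an aligned 32-bit access even though the address is not 128-bit aligned.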

How are you reading the register values for debugging?

Statistics: Posted by Fridux — Fri May 17, 2024 12:55 am


