ticktrace
// cookbook · sha256

SHA256 (M5-L)

RP2350 ships a hardware SHA-256 engine. Push 64-byte message blocks into a write-only FIFO; pull a 256-bit digest out at the end. The hardware does not pad; that's the firmware's job.

Driver: src/sha256.S. Defs: include/sha256.inc. Base: 0x400f8000. RESETS bit: 17. No clock setup needed beyond the M2 default.

API

sha256_init                                 deassert RESETS, configure CSR
sha256_start                                begin a new hash (CSR.START pulse)
sha256_write_word(r0=w)                     push one 32-bit word
sha256_write_block(r0=src_64bytes)          push 16 words from memory
sha256_get_digest(r0=dst_32bytes)           spin SUM_VLD, copy 8 words
sha256_compute(r0=msg, r1=len_b, r2=dst)    pad + push + read, all in one

sha256_compute is the only function most callers ever need.
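
For callers who do use the lower-level entry points, the call order is start, then sixteen word writes per block, then get_digest. The following is a host-side Python sketch of that sequence, not the driver's code: EngineModel and its method names are hypothetical stand-ins that mirror the API table, the words passed in are big-endian message words (BSWAP and WDATA_READY polling are not modeled), and the round constants are derived from primes as FIPS 180-4 defines them rather than typed out.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def _primes(n):
    """First n primes by trial division (n is tiny here)."""
    found, cand = [], 2
    while len(found) < n:
        if all(cand % p for p in found):
            found.append(cand)
        cand += 1
    return found

def _icbrt(n):
    """Exact integer cube root by binary search."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Fractional bits of sqrt / cbrt of the first primes, per FIPS 180-4.
IV = [math.isqrt(p << 64) & MASK for p in _primes(8)]
K = [_icbrt(p << 96) & MASK for p in _primes(64)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

class EngineModel:
    """Hypothetical model: start / write_word / get_digest, no padding."""

    def __init__(self):
        self.start()

    def start(self):
        self.h = list(IV)
        self.fifo = []

    def write_word(self, w):
        self.fifo.append(w & MASK)
        if len(self.fifo) == 16:        # one full 64-byte block
            self._compress(self.fifo)
            self.fifo = []

    def get_digest(self):
        return b"".join(x.to_bytes(4, "big") for x in self.h)

    def _compress(self, block):
        w = list(block)
        for t in range(16, 64):
            s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        a, b, c, d, e, f, g, h = self.h
        for t in range(64):
            t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
                  + ((e & f) ^ (~e & g)) + K[t] + w[t]) & MASK
            t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
                  + ((a & b) ^ (a & c) ^ (b & c))) & MASK
            a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
        self.h = [(x + y) & MASK for x, y in zip(self.h, (a, b, c, d, e, f, g, h))]

# Hand-padded single block for "abc": the model, like the hardware, never pads.
block = b"abc\x80" + bytes(52) + (24).to_bytes(8, "big")
engine = EngineModel()
engine.start()
for word in struct.unpack(">16I", block):
    engine.write_word(word)
assert engine.get_digest() == hashlib.sha256(b"abc").digest()
```

The closing self-check only passes because the block was padded by hand first; feeding the raw three bytes would leave the block incomplete.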

Quick start

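    bl      sha256_init             @ deassert RESETS, configure CSR
    ldr     r0, =message            @ r0 = pointer to message bytes
    ldr     r1, =message_len        @ r1 = length in bytes (literal load, not limited to 8 bits)
    ldr     r2, =digest_buf         @ r2 = pointer to 32-byte output buffer
    bl      sha256_compute
    @ digest_buf now holds 32 bytes of SHA-256(message)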

How padding works

SHA-256 requires every message to be padded to a multiple of 64 bytes: a single 0x80 byte, then zero bytes, then the message length in bits as a 64-bit big-endian integer in the last 8 bytes. sha256_compute builds the final block on the stack, or two blocks if the residue plus the 0x80 byte does not fit in the first 56 bytes. This matters if you skip sha256_compute and feed the engine via DMA: you still have to build and push the padded final block yourself.
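
The padding rule can be sketched on the host in a few lines of Python. This is an illustrative model, not the code in src/sha256.S; pad_message and final_blocks are hypothetical helper names.

```python
def pad_message(msg: bytes) -> bytes:
    """Return msg plus SHA-256 padding; the result is a multiple of 64 bytes."""
    bit_len = len(msg) * 8
    residue = len(msg) % 64
    # 0x80 marker, then zeros so that the 8-byte length ends the block.
    # When residue + 1 > 56 the modulo wraps and a second block is produced.
    zeros = (55 - residue) % 64
    return msg + b"\x80" + bytes(zeros) + bit_len.to_bytes(8, "big")

def final_blocks(msg: bytes) -> bytes:
    """Only the tail that has to be built separately: 64 or 128 bytes."""
    residue = len(msg) % 64
    return pad_message(msg)[len(msg) - residue:]

if __name__ == "__main__":
    for n in (3, 55, 56, 64):
        print(n, "message bytes ->", len(final_blocks(b"a" * n)) // 64, "final block(s)")
```

A DMA-fed caller would stream the whole 64-byte blocks straight from the message buffer and then push the output of something like final_blocks as the last transfer.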

CSR bitfield

Bit   Name               Notes
0     START              Pulse-set; self-clears.
1     WDATA_READY        1 = FIFO has space for another word.
2     SUM_VLD            1 = digest valid in SUM0..SUM7.
4     ERR_WDATA_NOT_RDY  Caller wrote WDATA when not ready. Latches.
9:8   DMA_SIZE           0=byte, 1=halfword, 2=word. We use word.
12    BSWAP              1 = byte-swap input. We enable it so little-endian CPU stores feed the big-endian-natural engine correctly.
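
Why BSWAP is needed is easiest to see by viewing the same padded block both ways. The Python sketch below is a host-side illustration only; bswap32 is a hypothetical helper, and the block is the hand-padded "abc" message.

```python
import struct

# Padded single block for "abc": message, 0x80, zeros, 64-bit big-endian bit length.
block = b"abc\x80" + bytes(52) + (24).to_bytes(8, "big")

cpu_words = struct.unpack("<16I", block)  # values a little-endian core loads from memory
sha_words = struct.unpack(">16I", block)  # message words as SHA-256 defines them

def bswap32(w: int) -> int:
    """Reverse the four bytes of a 32-bit word."""
    return int.from_bytes(w.to_bytes(4, "little"), "big")

# Swapping each CPU-side word yields exactly the word the algorithm expects.
assert all(bswap32(c) == s for c, s in zip(cpu_words, sha_words))

print(hex(cpu_words[0]), "->", hex(sha_words[0]))    # 0x80636261 -> 0x61626380
print(hex(cpu_words[15]), "->", hex(sha_words[15]))  # 0x18000000 -> 0x18
```

With the swap done by the engine, the push loop can load words from a byte buffer and store them unchanged.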

Throughput

The engine consumes one 64-byte block roughly every 64 cycles, i.e. about one byte per cycle; at clk_sys = 150 MHz that is ~150 MB/s peak. The CPU push loop adds a few cycles per word; DMA-fed streaming reaches ~140 MB/s. That is roughly 10× faster than software SHA-256 on the Cortex-M33.
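
The arithmetic behind these figures, as a small Python sketch. The inputs are the approximate numbers quoted above, not measurements, and the function names are hypothetical.

```python
CLK_SYS_HZ = 150_000_000   # clk_sys from the text
BLOCK_BYTES = 64           # one SHA-256 message block

def throughput_mb_s(cycles_per_block: float, clk_hz: int = CLK_SYS_HZ) -> float:
    """Bytes hashed per second, in MB/s, for a given per-block cycle cost."""
    return BLOCK_BYTES * clk_hz / cycles_per_block / 1e6

def cycles_per_block_for(mb_s: float, clk_hz: int = CLK_SYS_HZ) -> float:
    """Inverse: the per-block cycle cost implied by an observed throughput."""
    return BLOCK_BYTES * clk_hz / (mb_s * 1e6)

if __name__ == "__main__":
    print(throughput_mb_s(64))                   # 150.0, the quoted peak
    print(round(cycles_per_block_for(140), 1))   # cycles per block implied by 140 MB/s
```

Read the other way round, the ~140 MB/s DMA figure corresponds to a little under 69 cycles per block, so only a few cycles per block are lost to transfer overhead.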

Build artefacts

  • build/sha256_demo.uf2: hashes "abc" and prints the result over UART (expected: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad).
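
To check the demo's UART output against an independent reference, Python's hashlib can be used on the host. This is a minimal sketch; digest_matches is a hypothetical helper, and how the UART line is captured is left to the reader.

```python
import hashlib

def digest_matches(message: bytes, uart_hex: str) -> bool:
    """Compare a hex digest captured from the UART with hashlib's SHA-256."""
    expected = hashlib.sha256(message).hexdigest()
    # Tolerate trailing newlines and upper-case hex from the firmware side.
    return uart_hex.strip().lower() == expected

if __name__ == "__main__":
    captured = input("paste the UART digest line: ")
    print("match" if digest_matches(b"abc", captured) else "MISMATCH")
```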

T1 tests

tests/unicorn/test_sha256.py (4 cases):

  • sha256_init writes RESETS bit 17 and configures CSR.
  • sha256_write_word spins on WDATA_READY then stores once.
  • sha256_get_digest spins on SUM_VLD then reads 8 words.
  • sha256_compute("abc") emits exactly the right padded-block word stream.