A compiler bug

Written 2023-09-29

Tags:c compiler 

Around 10 years ago, I found a bug in a vendor's compiler.

The Driver

Quite often we needed to read an asynchronous hardware timer. Effectively like hooking a 32-bit parallel timer to a 32-bit input port, just entirely inside the chip.

Because the hardware timer isn't synchronous to the MCU clock, there's a very slim chance that reading the timer will occur just as the timer is incrementing, showing a mix of both old and new bits as the timer increment ripples through. This is a type of hardware race condition.

The hardware vendor said it would be enough to repeatedly read the timer until it returned the same value twice, then that was a safe value to use.

My first pass:

//pointer to memory-mapped hardware register
static const volatile uint32_t * async_timer = (volatile uint32_t*)0x4003D000;

uint32_t read_timer_simple()
{
    uint32_t x,y;

    do
    {
        x = *async_timer;
        y = *async_timer;
    } while(x!=y);
    return x;
}

This had a slight downside that on the rare mismatch, it would require reading the timer port an even number of times, and the timer port wasn't the fastest to read, so I came up with the following which takes only one additional timer read on mismatch:


uint32_t read_timer()
{
    uint32_t x;
    uint32_t y = *async_timer;

    while(1)
    {
        x = y;
        y = *async_timer;
        if(x==y) return x;
    }
}

The good assembly

I no longer have access to the original, so I've run this through GCC 13.2.0, -O2, which roughly matches:

read_timer():
        ldr     r2, .L10 //Load address of hardware register into r2
        ldr     r0, [r2] //load value of timer into r0
.loop:
        mov     r3, r0
        ldr     r0, [r2] //load next value of timer into r0
        cmp     r3, r0   //compare previous and next value of timer
        bne     .loop    //if not equal, go to .L8 to read timer again
        bx      lr       //return value in r0
.L10:
        .word   0x4003D000

The compiler upgrade

To support a new microchip with a newer CPU core, we upgraded from proprietary compiler 3.something to proprietary compiler 4.whatnot to gain support for some new instructions. This went generally well, except that every few hours to days, a device would lock up, and the user would restart the device, losing any chance to catch what was going on. Eventually, we were able to catch a live one and hook it up to a debugger.

The bug

read_timer():
        ldr     r2, .L10 //Load address of hardware register into r2
        ldr     r0, [r2] //load value of timer into r0
        mov     r3, r0
        ldr     r0, [r2] //load next value of timer into r0
.loop:
        cmp     r3, r0   //compare previous and next value of timer
        bne     .loop    //if not equal, go to .L8 to read timer again
        bx      lr       //return value in r0
.L10:
        .word   0x4003D000

Did you see it? The instructions are almost the same, but the branch target moved slightly. In the common case where the bus race condition does not occur, the timer is read twice, it matches and returns. I think the compiler optimizes assuming that the volatile read may return the same value, even though it does read it more than once. Effectively, the compiler implemented the following:

uint32_t read_timer_bugged()
{
    uint32_t x = *async_timer;
    uint32_t y = *async_timer;

    if(x!=y)while(1);
    return x;
}

...which of course, locks up the machine whenever the hardware race occurs.

Closing

Once we understood the issue, it was simple enough to reproduce, and the compiler vendor took care of it promptly in a paid upgrade to proprietary compiler 5.next