Non-SMP BSP on multicore systems

The current RISC-V BSPs, when built with RTEMS_SMP = False, do not seem to run properly on multicore systems.

According to the RISC-V spec, all CPUs in a system must start executing immediately at reset. With the current BSPs, this means that a binary starts executing on all CPUs at the same time.

The problem is that there is no logic in start.S to stop the CPUs that are not the boot CPU, which means every CPU keeps executing the same program until the copies start interfering with one another.

Is that intended? Are non-SMP BSPs meant to be run on single-core systems only?

By contrast, SPARC BSPs can run on multicore systems in a non-SMP configuration because SPARC cores start in a halted state by default.

That sounds like a bug. Most likely no one tested that case yet.

I think the initial message has an error. The subject line says Non-SMP BSP, but the body says RTEMS_SMP = True when I expect that should be False.

In general, multiple independent instances of a BSP (SMP or non-SMP) can run on a multicore system, but they must be configured with different memory spaces. Otherwise, the non-SMP BSP must detect that it’s on the wrong core and cease executing. Some architectures have this code and some do not. It sounds like it needs to be added to the RISC-V startup.

Good catch, fixed to the proper configuration (RTEMS_SMP = False). I am not talking about having two different programs on the same multicore system, but simply about loading and running a single non-SMP binary on a multicore system. Here is an example on hardware:

grmon4> load build/riscv-rtems7-noel64imafd/rtems-hello.exe; run
                 0 .start              80B              [===============>] 100%
                50 .text            111.8kB / 111.8kB   [===============>] 100%
             1bf50 .rodata           14.4kB /  14.4kB   [===============>] 100%
             1f8e0 .eh_frame            4B              [===============>] 100%
             1f8e8 .tdata              24B              [===============>] 100%
             1f900 .init_array          8B              [===============>] 100%
             1f908 .fini_array          8B              [===============>] 100%
             1f910 .rtemsroset        224B              [===============>] 100%
             1f9f0 .data              8.3kB /   8.3kB   [===============>] 100%
             21b20 .sdata             408B              [===============>] 100%
  Total size: 135.18kB (25.75Mbit/s)
  Entry point 0x00000000
  Image /home/matteo/dev/rcc/src/samples/build/riscv-rtems7-noel64imafd/rtems-hello.exe loaded

  CPU 0:  Interrupted!
          0x000000000000b5cc: 0047f793  andi    a5, a5, 4  <apbuart_outbyte_polled+44>
  CPU 1:  Forced into debug mode
          0x000000000000b5cc: 0047f793  andi    a5, a5, 4  <apbuart_outbyte_polled+44>

grmon4> inst 10 cpu0
  TIME       L  P  ADDRESS           INSTRUCTION                         RESULT              SYMBOL
  172083054  0  M  000000000000dc58  nop                                 [0000000000000000]  _IO_Relax+0x0
  172083054  1  M  000000000000dc5c  nop                                 [0000000000000000]  _IO_Relax+0x4
  172083055  0  M  000000000000dc60  nop                                 [0000000000000000]  _IO_Relax+0x8
  172083055  1  M  000000000000dc64  nop                                 [0000000000000000]  _IO_Relax+0xc
  172083056  0  M  000000000000dc68  nop                                 [0000000000000000]  _IO_Relax+0x10
  172083056  1  M  000000000000dc6c  nop                                 [0000000000000000]  _IO_Relax+0x14
  172083057  0  M  000000000000dc70  nop                                 [0000000000000000]  _IO_Relax+0x18
  172083057  1  M  000000000000dc74  nop                                 [0000000000000000]  _IO_Relax+0x1c
  172083058  1  M  000000000000dc78  ret                                 [000000000000dc7c]  _IO_Relax+0x20
  172083380  0  M  000000000000b5c8  lw      a5, 4(s0)                   [0000000000000000]  apbuart_outbyte_polled+0x28

grmon4> inst 10 cpu1
  TIME       L  P  ADDRESS           INSTRUCTION                         RESULT              SYMBOL
  172083219  0  M  000000000000dc58  nop                                 [0000000000000000]  _IO_Relax+0x0
  172083219  1  M  000000000000dc5c  nop                                 [0000000000000000]  _IO_Relax+0x4
  172083220  0  M  000000000000dc60  nop                                 [0000000000000000]  _IO_Relax+0x8
  172083220  1  M  000000000000dc64  nop                                 [0000000000000000]  _IO_Relax+0xc
  172083221  0  M  000000000000dc68  nop                                 [0000000000000000]  _IO_Relax+0x10
  172083221  1  M  000000000000dc6c  nop                                 [0000000000000000]  _IO_Relax+0x14
  172083222  0  M  000000000000dc70  nop                                 [0000000000000000]  _IO_Relax+0x18
  172083222  1  M  000000000000dc74  nop                                 [0000000000000000]  _IO_Relax+0x1c
  172083223  1  M  000000000000dc78  ret                                 [000000000000dc7c]  _IO_Relax+0x20
  172083545  0  M  000000000000b5c8  lw      a5, 4(s0)                   [0000000000000000]  apbuart_outbyte_polled+0x28

So yes, you’re correct in that the non-SMP startup code needs to detect whether it’s on the correct CPU (not even necessarily CPU0) and either continue or halt operation.

The correct CPU would be RISCV_BOOT_HARTID, correct?

If so, we can simply add the following at the beginning of start.S:

#ifndef RTEMS_SMP
	/* Non-SMP build: only the boot hart may continue */
	li	t3, RISCV_BOOT_HARTID
	csrr	t0, mhartid
	/* Any other hart branches to .Lwfi */
	bne	t0, t3, .Lwfi
#endif

Just tested on hardware and the hello world sample executes properly :grinning:
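
For anyone less used to RISC-V assembly, the check above can be modelled in plain C. This is only a rough sketch of the decision logic, not code from RTEMS: `MODEL_BOOT_HARTID`, `hart_may_boot` and `count_booting_harts` are hypothetical names, and the hart ID is passed in as a parameter, whereas the real code reads it from the mhartid CSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the BSP's RISCV_BOOT_HARTID value. */
#define MODEL_BOOT_HARTID 0u

/*
 * Models the compare-and-branch in the snippet: `mhartid` is what
 * "csrr t0, mhartid" would return on the current hart. A hart that is
 * not the boot hart must not continue into the normal startup path.
 */
bool hart_may_boot(uint64_t mhartid, uint64_t boot_hartid)
{
    return mhartid == boot_hartid;
}

/*
 * Given the IDs of all harts released from reset, count how many would
 * get past the check. For a non-SMP build the expected answer is one,
 * provided the boot hart ID is actually present in the system.
 */
size_t count_booting_harts(const uint64_t *hartids, size_t n,
                           uint64_t boot_hartid)
{
    size_t booting = 0;

    assert(hartids != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        if (hart_may_boot(hartids[i], boot_hartid)) {
            ++booting;
        }
    }

    return booting;
}
```

Calling `count_booting_harts` with the hart IDs {0, 1} and `MODEL_BOOT_HARTID` models the two-CPU system from the trace: only hart 0 continues, and hart 1 would take the branch to .Lwfi instead of also ending up in apbuart_outbyte_polled.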

That seems notionally correct, but I’m not a RISC-V expert, yet. Do you feel like putting this up in a MR since it works for you?

Made one here: bsps/riscv: Put processor to sleep if not boot processor (!1014)

I think this is describing (and fixing) the behavior of riscv start.S for a single core on a multicore device (#4735).
I have commented on the MR.

You are correct, I should have searched the issues beforehand :slight_smile: