Is eRPCWifi thread safe?

Hello !!

Just one question for the @Seeed team…
Is the eRPC library thread safe?

I mean, can multiple tasks call the library in parallel?
For example: 1 task for NTP sync, 1 task for WiFi connection management, 1 task for some HTTP requests…

Thanks a lot!

I will answer my own question…

The answer seems to be NO!

I’m currently testing my app with a mutex to prevent concurrent use of the eRPC library from multiple tasks.
The app has now been running fine for 12 hours…
Without the mutex, it sometimes takes only a few hours to get totally stuck.
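
For readers who want to see the pattern, here is a minimal sketch of such an application-level lock. It is written in portable C++ with std::timed_mutex so it can be compiled on a desktop; it is not the code from this app. RadioGate and withLock are hypothetical names that are not part of eRPCWifi. On the Wio Terminal the same idea would presumably use a FreeRTOS mutex taken with a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical wrapper: one lock shared by every task that talks to the radio.
// Assumption: all network calls in the application go through withLock().
class RadioGate {
public:
    // Runs `call` while holding the lock.
    // Returns false (and skips the call) if the lock could not be
    // acquired within `timeout`, so the caller can log it and retry later.
    template <typename Fn>
    bool withLock(std::chrono::milliseconds timeout, Fn&& call) {
        std::unique_lock<std::timed_mutex> lock(mutex_, timeout);
        if (!lock.owns_lock()) {
            return false;
        }
        call();
        return true;  // the lock is released when `lock` goes out of scope
    }

private:
    std::timed_mutex mutex_;
};
```

A task would wrap each HTTP, NTP or connection-check call in withLock() and treat a false return as "radio busy, try again later". Taking the lock with a timeout rather than waiting forever means a task that cannot get the radio can still report the failure.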

Sorry for the long wait. Which program has the problem?

Hello Citric,

No problem about the wait! :wink:

I have written my own application to display information from my home sensors on the screen.
This app runs on FreeRTOS, which is already used by eRPCWifi/unified.

This is the task list of my app:

Num  Task       Prio  Stack
8    Supervisi  3     81
3    IDLE       0     129
1    runServer  8     1738
2    runClient  7     2147
6    Connexion  3     886
7    NTP        3     819
4    Tmr Svc    2     29
5    UI         8     610

You can recognize the 2 tasks from eRPC (runServer and runClient).

  • I use the runClient task (the Arduino loop function) to make 3 consecutive HTTPClient requests (GET/POST).
  • The NTP task runs once an hour to sync the RTC time with an NTP server (the code comes from a Seeed Studio example).
  • The UI task manages the LVGL library to refresh the screen.
  • The Connexion task manages the WiFi connection (checks whether it is connected and reconnects if necessary).
  • The Supervision task checks whether something is stuck, displays the task list and some running stats, and resets the device if something is wrong.
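
The kind of check a supervision task performs can be sketched as a heartbeat table: each monitored task records a timestamp regularly, and the supervisor looks for one that has gone silent. This is only an illustration in portable C++, not the code from this app. Heartbeats, kick and firstStuck are hypothetical names, and the timestamps are assumed to come from a millisecond tick counter.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical heartbeat table for N monitored tasks.
template <std::size_t N>
class Heartbeats {
public:
    // `timeoutMs` is how long a task may stay silent; `nowMs` is the start time.
    explicit Heartbeats(std::uint32_t timeoutMs, std::uint32_t nowMs = 0)
        : timeoutMs_(timeoutMs) {
        lastSeenMs_.fill(nowMs);
    }

    // Called by a monitored task each time it completes a loop iteration.
    void kick(std::size_t task, std::uint32_t nowMs) {
        assert(task < N);
        lastSeenMs_[task] = nowMs;
    }

    // Called by the supervisor. Returns the index of the first task that has
    // been silent for longer than the timeout, or -1 if all are alive.
    int firstStuck(std::uint32_t nowMs) const {
        for (std::size_t i = 0; i < N; ++i) {
            // Unsigned subtraction stays correct when the tick counter wraps.
            if (nowMs - lastSeenMs_[i] > timeoutMs_) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::uint32_t timeoutMs_;
    std::array<std::uint32_t, N> lastSeenMs_;
};
```

When firstStuck() returns an index, the supervisor can print diagnostics and reset the device. On a real multitasking target the shared table would also need to be read and written safely across tasks, which this sketch leaves out.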

So, in my case, 3 tasks access the RTL chip through eRPCWifi/HTTPClient/UDPClient.
Without a mutex, it works for some time… a few hours.
With a mutex (to avoid concurrent usage of the RTL chip), it seems to work for longer…

When it stops working, the RTL chip seems to stop answering… (see my other post here: https://forum.seeedstudio.com/t/rpcwifi-problem-when-wifi-is-lost/262273 )

But adding mutexes does not seem to be sufficient… I got 1 reset last night (maybe a WiFi connection loss…)

So my original question still stands: are the different libraries (eRPCWifi, eRPCUnified, etc.) thread safe? (Can they be called by multiple tasks concurrently?)

Hello, I read your post. Your project environment is fairly complicated and our engineers are currently on a break, so I will briefly restate your question and pass it on to them. You mean that when the Wio Terminal accesses the RTL chip from multiple tasks at the same time, the RTL chip hangs and stops working whether or not a mutex is used, right? Is my understanding correct?

In short, you are almost right:

  • The RTL chip seems to stop responding when WiFi is lost while requests are in progress.
  • The eRPC library for the SAMD firmware seems not to be thread safe, and the application layer needs to use mutexes to avoid concurrent calls that can lead to a stuck situation (I don’t know why or where).

My assumptions:

  • Multiple concurrent calls to the eRPC library can lead to a stuck situation.
  • Using mutexes at the application level (SAMD51 firmware) helps a lot to avoid getting stuck at runtime… but does not solve the issues related to WiFi loss (see below).
  • If WiFi is lost (router unplugged, for example), the RTL chip firmware does not seem to detect it reliably.
  • HTTPClient can get stuck if WiFi is lost and the RTL chip firmware has not detected it yet (the request never returns!).
  • The eRPC library uses mutexes deep inside that could be held forever while waiting for the RTL chip to answer.
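
To illustrate the last two assumptions, here is a sketch of a bounded wait: the caller waits for a reply up to a deadline and gets an error back instead of blocking forever. This is not how the eRPC library is implemented, which is not shown in this thread. ReplySlot, post and waitFor are hypothetical names, and the sketch uses a standard C++ condition variable so it can be tried on a desktop.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical reply slot: the receive path calls post(),
// the requesting task calls waitFor() with a deadline.
class ReplySlot {
public:
    // Stores a reply and wakes up the waiting task, if any.
    void post(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            value_ = value;
            ready_ = true;
        }
        cv_.notify_one();
    }

    // Waits up to `timeout` for a reply. Returns true and fills `out` on
    // success, false if the deadline passed without a reply.
    bool waitFor(std::chrono::milliseconds timeout, int& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [this] { return ready_; })) {
            return false;  // no answer: report it instead of hanging
        }
        out = value_;
        ready_ = false;  // consume the reply
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int value_ = 0;
    bool ready_ = false;
};
```

With a wait like this, a request made while the link is down would return false after the timeout, and the application could then try to reconnect rather than rely on a device reset.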

Please also send my detailed posts to your engineers… I think they can help them. (I’m a software engineer myself.)

I can also capture logs with traces, try fixes, etc… Don’t hesitate to contact me if I can help them!

Hello @Citric ,

I just got a failure.
Here are my app traces…

Getting HTTP values
[V][HTTPClient.cpp:236] beginInternal(): url: http://192.168.9.242/admin/divers/ajax/lecture.php
[D][HTTPClient.cpp:277] beginInternal(): host: 192.168.9.242 port: 80 url: /admin/divers/ajax/lecture.php
[D][HTTPClient.cpp:563] sendRequest(): request type: ‘POST’ redirCount: 0

16 Jan 2022, 19:51:39
16 Jan 2022, 19:51:40
16 Jan 2022, 19:51:41
NETWORK: Fail to get mutex
16 Jan 2022, 19:51:42
16 Jan 2022, 19:51:43
16 Jan 2022, 19:51:44
16 Jan 2022, 19:51:45
16 Jan 2022, 19:51:46
NETWORK: Fail to get mutex
16 Jan 2022, 19:51:47
16 Jan 2022, 19:51:48
16 Jan 2022, 19:51:49
[D][WiFiGeneric.cpp:383] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:407] _eventCallback(): Reason: 0 - MAX
Event, Disconnected from WIFI access point
Event, WiFi lost connection. Reason: 0
16 Jan 2022, 19:51:50
*************************************
Num Task Prio Stack
-------------------------------------
8 Supervisi 3 81
3 IDLE 0 129
5 UI 8 613
7 NTP 3 821
4 Tmr Svc 2 29
6 Connexion 3 886
2 runClient 8 2138
1 runServer 8 1738
*************************************
Heap free:14608

After that, HTTPClient never returns, and after 2 minutes my supervisor task resets the device.
In this case I received the WiFi event, but sometimes I don’t get it!
Nevertheless, HTTPClient is stuck.
If I try to check whether WiFi is connected using the “WiFi.isConnected()” function, it does not return (same behavior for all WiFi-related functions…)

Hope this can help !

Eric.

Understood, I will contact them as soon as possible to follow up on this issue

I consulted a colleague, and he said that it is best to run this in the Arduino environment. Wio Terminal may conflict with PlatformIO in some compilations. He said that he had seen this by reading the decompiled assembly.

By the way, can you post the full code for us to test?

Hello,

OK, I can send you my full code, but not on the forum.
Please give me an email address to send the code to.

In any case, the code will not work properly on your side because it needs my home controllers…

Eric.

Hi Citric,

I just sent the email. Tell me if you got it!

Eric.

Sorry, I just checked my email and it looks like I haven’t received your email

OK, it is impossible to send you the email… it was rejected each time.
So I sent it to you as a direct message on the forum.

I got it, I will help you test and reply as soon as possible

I heard back from the engineers, and this is their assessment. The SAM D51 chip in the Wio Terminal is a single-core MCU that cannot execute tasks in parallel. An RTOS cannot correct this, because an RTOS is only responsible for scheduling tasks and cannot solve problems at the hardware level.
He has two suggestions for dealing with this problem. First, store the data on the memory card, then release the Wio Terminal’s memory, then connect to WiFi, upload the data, and release the memory again, completing these tasks one step at a time.
Second, replace the Wio Terminal with an ESP32.

Uh…
I’m a little bit disappointed by your answer…
This is not related to whether the CPU is single core or not… :open_mouth:

The goal of an OS like FreeRTOS is to let multiple tasks run…
But of course, the application or library needs to take care of concurrent use of the same hardware resource…
That is the main purpose of mutexes…

What I suspect here is that the eRPCWiFi library has no such mechanism, even though a common UART resource is used to communicate with the RTL chip…

So, OK, I added mutexes on the application side… that solves THIS problem…

But the other problem still has to be investigated!

Can you also check this with the engineering team?

Thanks again!

OK, I understand, I will let them follow up on this after the Spring Festival.