Pascal Brisset explains (on erlang-questions some time ago) a scenario where Erlang's selective receive can fall behind...
The system is dimensioned so that the CPU load is low (say 10 %).
Now at some point in time, the backend service takes one second
longer than usual to process one particular request.
You'd expect that some requests will be delayed (by no more than
one second) and that quality of service will return to normal
within two seconds, since there is so much spare capacity.
Instead, the following can happen: During the one second outage,
requests accumulate in the message queue of the server process.
Subsequent gen_server calls take more CPU time than usual because
they have to scan the whole message queue to extract replies.
As a result, more messages accumulate, and so on.
snowball.erl (attached) simulates all this. It slowly increases
the CPU load to 10 %. Then it pauses the backend for one second,
and you can see the load rise to 100 % and remain there, although
the throughput has fallen dramatically.
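
A toy model, separate from the attached snowball.erl, can make the feedback loop described above concrete. It is written in Python rather than Erlang, and every number in it is an assumption chosen for illustration: one request per millisecond, 100 microseconds of useful work per request (the 10 % load), and 2 microseconds to skip over each queued message while looking for the backend reply. The function name `final_backlog` is hypothetical.

```python
def final_backlog(pause_us, duration_us=10_000_000, interarrival_us=1_000,
                  base_us=100, scan_us=2):
    """Return how many requests are still queued when simulated time runs out.

    All times are in microseconds and all defaults are made-up values.
    """
    t = pause_us   # the server is stuck on one slow backend call until then
    served = 0
    while t < duration_us:
        arrived = t // interarrival_us + 1   # requests arrive at 0, 1000, 2000, ...
        backlog = arrived - served
        if backlog == 0:
            t = arrived * interarrival_us    # idle until the next request arrives
            continue
        # The backend reply lands behind every other queued request, so the
        # selective receive has to step over all of them before finding it.
        t += base_us + scan_us * (backlog - 1)
        served += 1
    return t // interarrival_us + 1 - served


if __name__ == "__main__":
    for pause in (0, 200_000, 1_000_000):
        print(pause, final_backlog(pause))
```

Under these assumptions the tipping point is a backlog of roughly 450 requests, where handling one request costs as much time as it takes the next one to arrive. A 0.2 second pause stays below that and the queue drains to at most one request. A one second pause goes well above it and leaves several thousand requests queued after ten simulated seconds.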
Here are several ways to avoid this scenario...
...
Add a proxy process dedicated to buffering requests from clients and making sure the message queue of the server remains small. This was suggested to me at the erlounge. It is probably the best solution, but it complicates process naming and supervision. And programmers just shouldn't have to wonder whether each server needs a proxy or not.
I'm not sure it really complicates naming and supervision that much, and I think it is the best solution. The problem is not selective receive per se, whose benefits outweigh this specific scenario. It would be especially wrong to gum up the Erlang language and its simple message-passing mechanisms just for this.
The problem in this scenario is *coupling* the asynchronous selective receive too closely with the synchronous backend service. This is not an uncommon scenario in all kinds of "service-oriented architectures", and the solution, generally, should be the one quoted above.
A programmer should legitimately wonder whether some kind of "proxy" is needed when they see this kind of combination.
This is related to the blog posts that went round not so long ago among fuzzy, Bill de hÓra, Dan Creswell, and others.
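
The buffering proxy quoted above can be sketched as a toy model that counts how many mailbox entries the server inspects. This is a Python illustration rather than Erlang, and `Mailbox`, `handle_one`, `run_direct` and `run_with_proxy` are hypothetical names. It assumes that a selective receive scans from the oldest message, and that the proxy hands over one request at a time.

```python
from collections import deque


class Mailbox:
    """Toy mailbox: a selective receive scans from the oldest message."""

    def __init__(self):
        self.messages = []
        self.scanned = 0   # total messages inspected by all receives

    def send(self, msg):
        self.messages.append(msg)

    def receive(self, predicate):
        """Remove and return the first matching message, counting inspections."""
        for i, msg in enumerate(self.messages):
            self.scanned += 1
            if predicate(msg):
                return self.messages.pop(i)
        raise LookupError("no matching message")  # a real receive would block


def handle_one(mailbox):
    """Take one request, then wait for the matching backend reply."""
    request = mailbox.receive(lambda m: m[0] == "request")
    # Synchronous backend call: the reply is queued behind everything
    # that is already in the server's mailbox.
    mailbox.send(("reply", request[1]))
    return mailbox.receive(lambda m: m == ("reply", request[1]))


def run_direct(n):
    """Clients send straight to the server, so n requests pile up in its mailbox."""
    server = Mailbox()
    for i in range(n):
        server.send(("request", i))
    for _ in range(n):
        handle_one(server)
    return server.scanned


def run_with_proxy(n):
    """The proxy buffers requests in a plain FIFO and forwards one at a time."""
    buffered = deque(("request", i) for i in range(n))
    server = Mailbox()
    while buffered:
        server.send(buffered.popleft())
        handle_one(server)
    return server.scanned


if __name__ == "__main__":
    print(run_direct(100), run_with_proxy(100))
```

With 100 pending requests the direct version inspects 5150 mailbox entries and the proxied version inspects 200, so the cost per request stops depending on the backlog. The proxy itself never needs a selective receive, because it takes messages in arrival order. In a real system it would also need some signal from the server that the previous request is done, which this sketch leaves out.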