Wednesday, October 5, 2016

MQTT performance methodology using MIMIC MQTT Simulator

Performance in a small-scale environment rarely predicts behavior in a
large-scale deployment. But it is too expensive to set up large numbers
of real MQTT sensors to load test your IoT back-office platform, including
the MQTT broker and client applications.

With MIMIC MQTT Simulator, it is simple to create large sensor simulations
(you can run up to 100,000 simulated sensors on a single server)
to verify performance.

But how do you guarantee that your performance tools are not impacting
your tests? For example, if you determine that your broker handles a load
generated by your load generator with satisfactory response, are you sure
that the load generator is not slowing down the whole test?

The methodology is to use MIMIC to simulate a large environment with
synthetic background throughput (the "load rig"), then verify your performance
requirements (e.g. maximum round-trip delay) either with a small number of
your real-world sensors, or with another MIMIC setup measuring end-to-end
latency (the "measurement rig"). That way you can be sure the synthetic load
affects your measurement setup only through the system under test, the broker.
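
To make the "measurement rig" idea concrete, here is a minimal sketch of one
common way to measure end-to-end latency: the publisher stamps each message
with its send time, and the subscriber subtracts that from its receive time.
This is not MIMIC's implementation. The JSON payload layout and the function
names make_payload and latency_seconds are assumptions made for illustration,
and the approach only works when publisher and subscriber share a clock (for
example, when both run on the same host).

```python
import json
import time


def make_payload(sensor_id, value, now=None):
    """Build a message body that carries its own send timestamp.

    The field names are hypothetical; any layout works as long as the
    subscriber knows where to find the send time.
    """
    sent_at = time.time() if now is None else now
    return json.dumps({"sensor": sensor_id, "value": value, "sent_at": sent_at})


def latency_seconds(payload, now=None):
    """Return receive time minus the send time embedded in the payload."""
    received_at = time.time() if now is None else now
    message = json.loads(payload)
    return received_at - message["sent_at"]
```

In a real test, make_payload would be called just before publishing and
latency_seconds inside the subscriber's message callback. Because only the
measurement rig stamps and reads these messages, the load rig can grow
without touching the measuring code path.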






In the screenshot above we are running 10 sensors with end-to-end
measurement instrumentation, and the end-to-end delay is plotted in the
bottom graph. It shows the minimum, average and maximum delay for messages
from those sensors to a subscriber running in the same MIMIC instance.
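
A graph of minimum, average and maximum delay can be produced from raw
per-message delays by grouping them into fixed time windows. The sketch below
shows one way to do that. The function name summarize_delays and the
10-second window are assumptions, not MIMIC settings.

```python
from statistics import mean


def summarize_delays(samples, window_seconds=10.0):
    """Group (receive_time, delay) samples into fixed-size time windows.

    Returns a list of (window_start, min_delay, avg_delay, max_delay)
    tuples, sorted by window start. The window size is an assumed value.
    """
    windows = {}
    for received_at, delay in samples:
        # Round the receive time down to the start of its window.
        start = (received_at // window_seconds) * window_seconds
        windows.setdefault(start, []).append(delay)
    return [
        (start, min(delays), mean(delays), max(delays))
        for start, delays in sorted(windows.items())
    ]
```

Plotting the three values per window keeps occasional slow messages visible
in the maximum even when the average barely moves.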

From another MIMIC instance, we keep adding synthetic load onto the
MQTT broker under test, from 0 to 1000 sensors in steps of 100. The upper
graph shows the size of the background load over the 15 minutes of the test.
Each background load sensor publishes 1 message per second, so the
throughput in messages per second equals the number of sensors. This is
trivial to change in MIMIC to match your real-world expectations.
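
The arithmetic behind the load ramp can be sketched as follows. The post gives
only the total duration (15 minutes) and the step size (100 sensors), so
spending equal time at each load level is an assumption here, and the function
name load_schedule is hypothetical.

```python
def load_schedule(max_sensors=1000, step=100, rate_per_sensor=1.0,
                  total_seconds=900.0):
    """List the load levels of a stepped ramp and their offered throughput.

    Assumes equal time at each level, which the original test may not
    have used. Throughput is sensors multiplied by the per-sensor rate.
    """
    levels = list(range(0, max_sensors + 1, step))
    seconds_per_level = total_seconds / len(levels)
    return [
        {
            "start_seconds": index * seconds_per_level,
            "sensors": sensors,
            "messages_per_second": sensors * rate_per_sensor,
        }
        for index, sensors in enumerate(levels)
    ]
```

With the defaults this yields 11 levels (0, 100, ..., 1000 sensors). Changing
rate_per_sensor shows how the offered throughput scales if your real sensors
publish more or less often than once per second.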

As you can see, the delay increases only slightly over time, except
for two notable bumps at 600 and 1000 sensors. It is trivial to repeat the
scenario and verify whether the problem is reproducible. You would never
find it without running the tests.
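
One way to check that a bump is reproducible is to flag unusually slow load
levels in each run and keep only the levels flagged in every run. The sketch
below assumes you have the average delay per load level for each run. The
function names and the threshold of 1.5 times the run's median delay are
illustrative choices, not something MIMIC prescribes.

```python
from statistics import median


def find_bumps(delay_by_load, factor=1.5):
    """Return the load levels whose delay exceeds factor x the run's median.

    delay_by_load maps a sensor count to the average delay seen at that
    level. The factor is an assumed threshold; tune it to your data.
    """
    baseline = median(delay_by_load.values())
    return {load for load, delay in delay_by_load.items()
            if delay > factor * baseline}


def reproducible_bumps(runs, factor=1.5):
    """Return the load levels flagged as bumps in every run."""
    bump_sets = [find_bumps(run, factor) for run in runs]
    return set.intersection(*bump_sets) if bump_sets else set()
```

A level that shows up in only one run is more likely noise. A level that
shows up in every run is worth investigating on the broker side.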

(This post has been updated by this newer post.)