This post is another part of our list of short posts pinpointing specific cases OVHcloud has dealt with, both preparing for and during the migration. Here, we tell a story of how sometimes, a little unexpected behaviour can lead to a bug fix in a software program used by millions of people around the world.
In this post, I explained how we had to push proxySQL past its limits.
Julien explained here how difficult it can be to hijack pieces of a software program, and make it work for a very specific use case. One so specific that we just could not push it upstream, as we were breaking some assumptions on the base code (something like, “look, we don’t use this, maybe we can divert it, to make it match our use”).
But sometimes, as we dig deeper and deeper through lines of code, pushing the limits of open-source software programs further and further, we reach bugs.
Here are two real-life case where we had an unexpected behaviour from ProxySQL, resulting in a patch for MariaDB/MySQL.
1. The importance of quoting
An unexpected behaviour
When we first started using ProxySQL, we had an unexpected behaviour. ProxySQL used to hang. No warning, no precursor. A hang means that ProxySQL cannot process your requests, causing your websites to be unavailable as a result.
Digging
Sysadmins know that when in doubt, check the logs. That’s what we did, and we noticed this:
[ERROR] Detected a broken connection during SET NAMES on 192.168.59.272 , 3306 : 2019, Can't initialize character set (null) (path: compiled_in)
So we took a closer look at the logs, tried to replicate the behaviour, and finally came across this:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'binary' at line 1
This is clearly an error returned from the MySQL server. As this log line was generated on the ProxySQL, it means the error was between our ProxySQL and the MySQL server.
Finding gold
Julien worked hard — he read tonnes of logs, traced many PIDs, and tracked the problem down to THE triggering case — issuing this command:
set names binary COLLATE binary;
A collation is a kind of rule-set for comparing strings. It can define, for example when sorting alphabetically, if A
comes before a
, if é
should be treated as e
or not, or where œ
should be in the alphabet.
You can read more about it on MariaDB’s Knowledge Base.
The fix
After submitting an issue on ProxySQL’s bug-tracker with what we discovered, and proposing a patch, the author of ProxySQL had a look and confirmed the bug.
While writing ProxySQL, René Cannaò did what all developers do — he followed MariaDB’s connector documentation. And the bug was from here:
According to the documentation on SET NAMES syntax (https://dev.mysql.com/doc/refman/5.7/en/set-names.html) : charset_name and collation_name may be quoted or unquoted This doesn't seem true for "SET NAMES binary COLLATE binary" , that requires the collation name to be quoted.
(https://bugs.mysql.com/bug.php?172id=93692)
The butterfly effect
So, from a hang on our infrastructure, we reported a bug to the project’s creator, who then traced it to a bug in a MariaDB connector, followed it up to its parent MySQL, and fixed it upstream. The bug was closed in early 2020.
In the meantime, we worked with Mr Cannaò to provide a workaround (basically forcing the collation name to be quoted in ProxySQL’s code).
2. When wires get crossed
While writing this article, I remembered another funny bug. I decided to check this bug status, and noticed we never reported it. My mistake is now fixed.
An unexpected behaviour
When using some real but exotic charsets on ProxySQL (we also love tricky scenarios, so we try to test them as much as possible), we had a ProxySQL error, stating we were using an incorrect charset.
MySQL_Session.cpp:2964:handler(): [WARNING] Error during query on (42,192.168.59.272,3306): 1267, Illegal mix of collations (utf8_general_ci,IMPLICIT) and (dec8_swedish_ci,COERCIBLE) for operation '='
Digging
Of course, first of all we checked that we were using a real and legit charset and associated collation. We double- and triple-checked that we should be allowed to use `dec8_swedish_ci`.
We decided to have a look at the connector’s source code.
Finding gold
If you are curious about it, you can have a look at older versions of the `libmariadb/my_charset.c` code, starting from line 541. You will notice that `dec8_swedish_ci` is nowhere to be found. But if you look closely, you will notice dec8_swedisch_ci. Dear friends from Sweden, you were not the only ones with a typo.
The fix
When fixing our issue with exotic charsets, we applied a custom patch on our custom build of ProxySQL. In the heat of the moment, we delayed the bug reporting.
We forked the mariadb-corporation / mariadb-connector-c GitHub repository, fixed some typos, and proposed a pull request — which was then merged mid-February.
The butterfly effect
Everyone is prone to typos, and making mistakes. We encountered some, they were fixed and ported for everyone.
Conclusion
As a leading web hosting provider in Europe, we face the Law of truly large numbers (https://en.wikipedia.org/wiki/Law_of_truly_large_numbers). TL;DR: Each and every event with a non null occurring probability will happen.
We host more than 1.3 million websites. Let’s assume that 0.0001% of our customers can trigger one single specific bug — we have at least 1 customer who will trigger it. The question is not if, but when we will have to deal with it, and how to do so.
Wondering how we noticed that we had hit the Law of truly large numbers? Stay tuned, because we will write about this soon.
Sharing is caring
OVHCloud loves open-source, and loves sharing. This is not the first time we contributed to the open-source world, and definitely not the last.
Member of the database team since 2017