Lighttpd and luasocket
A few days ago I got a nice task. Jo had discovered an interesting bug while playing around with memcached, lighttpd and Lua.
When lighttpd gets a request, it can call a Lua script before FastCGI takes over. The Lua script is able to modify the complete request handling. Jo tried to use this ability to make a cache decision in Lua and thereby save the PHP + FCGI overhead by serving from the cache when an entry is available. As this approach mostly makes sense on high-traffic sites, people want to use a distributed caching system like memcached, where the cache can run on a completely different machine and be queried over the network. To interact with memcached, Jo used the Memcache.lua bindings, which are quite similar to the bindings for other languages such as PHP.
Unfortunately, when Memcached.Connect was called from lighty, lighty crashed silently. When the same call was made from the standalone Lua interpreter, everything worked fine. So what happened?
In fact it was not the memcache bindings that caused the problem, but the underlying luasocket library. I discovered that luasocket fails while initializing the TCP socket and setting up the buffer that holds the data to be sent through the socket. At first I thought the way lighty handles sockets might confuse luasocket. Apparently it does not: the socket gets allocated by the system and returned just fine.
I was pretty confused when I looked at the exact point where execution stopped. The buffer_init function that was called returned an invalid pointer, even though the function itself is so simple that it can hardly go wrong. Every time I tried to step into this mysterious buffer_init of luasocket, gdb seemed to jump to a wrong line instead of into the function. Refreshing my source directories in gdb finally made my day: suddenly gdb stepped into lighty’s buffer_init function.
Now it was pretty obvious: lighttpd also has a function called buffer_init. When the luasocket library is loaded as a shared module, its calls to buffer_init end up in lighty’s buffer_init, which is already present in the process, instead of in its own. Therefore luasocket always used lighty’s function, which initialized the buffer in an unexpected way and left luasocket with a null pointer instead of a correctly initialized value. As luasocket does not check for this, it segfaults (and takes lighty with it) as soon as it tries to access the pointer. So the actual fix is to prefix all of luasocket’s exported functions, update every reference to them, and recompile.
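To make the fix a bit more concrete, here is a minimal sketch in C. It is not the real lighttpd or luasocket code: the struct layouts, the ls_ prefix, the return codes and the run_demo helper are all assumptions made up for illustration. It only shows the idea that once the library's initializer carries its own prefix, it can live next to the host's buffer_init without any name clash, and that a NULL check turns a silent crash into an error the caller can handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the host's buffer type (hypothetical, simplified). */
typedef struct host_buffer {
    char  *ptr;
    size_t used;
    size_t size;
} host_buffer;

/* Stand-in for the host's exported function: allocates an empty buffer.
 * The caller owns the result and must free() it. */
host_buffer *buffer_init(void) {
    return calloc(1, sizeof(host_buffer));
}

/* Stand-in for the library's buffer type (hypothetical, simplified). */
typedef struct ls_buffer {
    size_t first;
    size_t last;
    char   data[256];
} ls_buffer;

/* The library's initializer under a prefixed name (the ls_ prefix is an
 * assumption), so it can no longer be confused with the host's buffer_init.
 * Returns 0 on success and -1 if the caller passed NULL, instead of
 * dereferencing the pointer blindly. */
int ls_buffer_init(ls_buffer *buf) {
    if (buf == NULL) {
        return -1;
    }
    buf->first = 0;
    buf->last = 0;
    return 0;
}

/* Exercises both functions side by side; returns 0 if everything behaved. */
int run_demo(void) {
    host_buffer *h = buffer_init();
    if (h == NULL) {
        return -1; /* allocation failed */
    }

    ls_buffer b;
    int rc = ls_buffer_init(&b);
    assert(rc == 0 && b.first == 0 && b.last == 0);
    assert(ls_buffer_init(NULL) == -1);

    free(h);
    return 0;
}
```

Both definitions compile into the same program because their names differ. With two functions both called buffer_init, only one of them would win, which is exactly the situation described above.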