I’ve recently been working on a pull request for Rake and I’ve run into some problems I don’t know how to solve, so I’m asking for some help.
Here’s the pull request on GitHub.
At my previous job, we used Rake to build our C++ projects. We used Rake’s multitask feature to allow multiple C++ files to be compiled at the same time, which nicely speeds up builds. However, the output from the various tasks often gets jumbled. Given that the tasks are running in parallel, this isn’t surprising, but it is a major pain when an error occurs. In that case, the error message really needs to be on its own line so that the editor or IDE can find it properly when navigating through errors.
Here’s a simplified example that illustrates the interleaved output:
Running rake with this Rakefile results in output something like
My initial idea for fixing this problem was to wrap the $stdout and $stderr streams with an object that would acquire and hold a lock while forwarding any output message to the original stream. That would essentially treat any single output operation (like a call to puts) as an atomic operation. As of this writing, the code in the pull request implements this solution.
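To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the code from the pull request: the class name SynchronizedStream and the list of forwarded methods are assumptions made for illustration, and plain threads stand in for multitask prerequisites.

```ruby
require "stringio"

# Hypothetical wrapper (illustrative name, not Rake's API): every forwarded
# call holds a lock for its full duration, so one call is one atomic unit.
class SynchronizedStream
  def initialize(stream, lock = Mutex.new)
    @stream = stream
    @lock = lock
  end

  # Forward a small, assumed set of output methods under the lock.
  %i[write puts print flush].each do |name|
    define_method(name) do |*args|
      @lock.synchronize { @stream.public_send(name, *args) }
    end
  end
end

# Usage: several threads write through the same wrapper.
buffer = StringIO.new
out = SynchronizedStream.new(buffer)
threads = 4.times.map do |i|
  Thread.new { 50.times { out.puts("task #{i}: compiling") } }
end
threads.each(&:join)
```

In this sketch the guarantee only covers calls made explicitly on the wrapper object, such as out.puts. Passing the same lock to two wrappers (one per stream) would serialize output across both streams.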
After testing this approach, I realized that it doesn’t work. There are a couple of problems:
Calling puts or other output methods directly (as opposed to something like $stdout.puts) calls methods on Kernel. Logically, these methods simply forward to $stdout as if we’d written $stdout.puts. However, in MRI these methods are implemented directly in C, so the forwarding is done at that level. These output methods ultimately make one or more calls to the low-level write() function. It is only these low-level calls that go through the wrapper and get synchronized. In the example above, where I call puts, it is the resulting calls to write that go through the wrapper. JRuby seems to do something similar here.
Many Rake tasks use sh or similar to run external programs. These programs may write to the standard output streams directly, and those writes won’t go through the wrapper either.
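For the external-program case, one possible direction (a sketch only, not something the pull request does) is to capture a command’s output instead of letting it inherit the standard streams, and then emit the captured text in a single locked write. The helper name run_captured and the constant OUTPUT_LOCK are hypothetical; Open3.capture2e is part of Ruby’s standard library.

```ruby
require "open3"
require "rbconfig"
require "stringio"

# Hypothetical shared lock guarding every write to the real output stream.
OUTPUT_LOCK = Mutex.new

# Hypothetical helper: run an external command, collect its combined
# stdout and stderr, then write the whole result while holding the lock.
def run_captured(*command, out: $stdout)
  output, status = Open3.capture2e(*command)
  OUTPUT_LOCK.synchronize do
    out.write(output)
    out.flush
  end
  status.success?
end

# Usage: run a child Ruby process and collect its output in a StringIO.
sink = StringIO.new
ok = run_captured(RbConfig.ruby, "-e", "puts 'hello'", out: sink)
```

The trade-off in this sketch is that a command’s output only appears once the command has finished, and stdout and stderr are merged into one stream, so it suits short compile steps better than long-running programs.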
I’m not sure where to go next. Does anyone have any ideas on how to synchronize output to standard streams in a way that still allows the various tasks to run in parallel? I don’t mind if the output from the various tasks is interleaved; I just want each line of output to stand on its own. I’ll settle for having each independent write operation be treated as atomic.
I’d appreciate any ideas or advice on how to proceed with this, or pointers to other solutions that people have come up with in similar contexts.