Need Help With Pull Request
I’ve recently been working on a pull request for Rake and I’ve run into some problems I don’t know how to solve, so I’m asking for some help.
Here’s the pull request on GitHub.
At my previous job, we used Rake to build our C++ projects. We used Rake’s multitask feature to allow multiple C++ files to be compiled at the same time, which nicely speeds up builds. However, the output from the various tasks often gets jumbled. Given that the tasks are running in parallel, this isn’t surprising, but it is a major pain when an error occurs. In that case, the error message really needs to be on its own line so that the editor or IDE can find it properly when navigating through errors.
Here’s a simplified example that illustrates the interleaved output:
Running rake with this Rakefile results in output something like this:
My initial idea for fixing this problem was to wrap the $stdout and $stderr streams with an object that would acquire and hold a lock while forwarding any output message to the original stream. That would essentially treat any single output operation (like a call to print or puts) as an atomic operation. As of this writing, the code in the pull request implements this solution.
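To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the code from the pull request: the class name SynchronizedStream is hypothetical, and a StringIO stands in for the real stream so the result is easy to inspect.

```ruby
require "stringio"

# Hypothetical wrapper (not the pull request's code): holds a lock while
# forwarding each output call to the underlying stream.
class SynchronizedStream
  def initialize(target, lock = Mutex.new)
    @target = target
    @lock = lock
  end

  # Each forwarded call runs entirely inside the lock, so it is atomic
  # with respect to other threads using the same wrapper.
  def write(*args)
    @lock.synchronize { @target.write(*args) }
  end

  def print(*args)
    @lock.synchronize { @target.print(*args) }
  end

  def puts(*args)
    @lock.synchronize { @target.puts(*args) }
  end

  def flush
    @lock.synchronize { @target.flush }
    self
  end
end

# Usage: several threads print multi-argument messages through the wrapper.
buffer = StringIO.new
out = SynchronizedStream.new(buffer)

threads = 4.times.map do |i|
  Thread.new do
    50.times { out.print("task ", i, " says hello", "\n") }
  end
end
threads.each(&:join)
```

Note that this sketch calls print on the wrapper object itself, so the whole call runs under the lock. That is the case the wrapper handles well.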
After testing this approach, I realized that it doesn’t work. There are a couple of problems:
- Using puts or other output methods directly (as opposed to something like $stdout.puts) calls methods on Kernel. Logically, these methods simply forward to $stdout as if we’d written $stdout.puts. However, in MRI these methods are implemented directly in C and so the forwarding is done at that level. These output methods ultimately make one or more calls to the low-level write() function. It is only these low-level calls that go through the wrapper and get synchronized. In the example above where I call print with four arguments, the high-level call to print doesn’t go through the wrapper, and so is not treated as an atomic operation. Instead, there are four separate calls to write that go through the wrapper. JRuby seems to do something similar here.
- Many Rake tasks use sh or similar to run external programs. These programs may write to the standard output streams directly, and those writes won’t go through the wrapper either.
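The first problem can be observed with a small experiment. The sketch below temporarily replaces $stdout with a recorder object (WriteRecorder is a hypothetical name) and then calls Kernel's print with four arguments. Based on the behavior described above, I would expect the recorder to see one write call per argument rather than a single print call, though the exact count may depend on the Ruby implementation and version.

```ruby
# Hypothetical recorder: stands in for $stdout and logs every write it receives.
class WriteRecorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  # The only method the interpreter needs on a $stdout replacement.
  # Returns the number of bytes "written", as IO#write does.
  def write(*chunks)
    text = chunks.join
    @calls << text
    text.bytesize
  end
end

recorder = WriteRecorder.new
original = $stdout
begin
  $stdout = recorder
  # Kernel#print, not $stdout.print: the wrapper never sees a print call.
  print "compiling ", "foo.cpp", " ... ", "done\n"
ensure
  $stdout = original
end

puts "write calls received: #{recorder.calls.length}"
puts recorder.calls.inspect
```

If the recorder reports more than one call, another thread can slip its own write in between any two of them, which is exactly the interleaving I’m trying to avoid.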
I’m not sure where to go next. Does anyone have any ideas on how to synchronize output to standard streams in a way that still allows the various tasks to run in parallel? I don’t mind if the output from the various tasks is interleaved; I just want each line of output to stand on its own. I’ll settle for having each independent write operation be treated as atomic.
I’d appreciate any ideas or advice on how to proceed with this, or pointers to other solutions that people have come up with in similar contexts.