
Ruby on Windows: A Note for Microsoft
Thursday, 21 September 2006

Posted by austin in: Ruby, comments closed

Good evening. I have finally sat down to try to pull together the issues related to building Ruby with Visual Studio 8. I apologize for the lengthy delay in producing this so that the conversation can continue on this matter.

As I said back in June, the primary concern that most Ruby developers have is the difficulty of compiling extensions on Windows. This is especially true for developers with an extensive background in UNIX-like environments, where you can easily do:

./configure && make && make test && sudo make install

or

ruby extconf.rb && make && make install

or

gem install <name-of-extension-gem>

Visual Studio 2005 (VS8) does not make this easier because of the choice to require manifests for the unmanaged assemblies with SxS installation. Much of this can be automated, but some common tools are completely missing from the VS8 tool chain, which makes building harder than it needs to be. To be fair, there are also tools missing from the Ruby environment, which makes building extensions harder than it needs to be.

Ruby needs tools similar to the Python distutils, but this is something that the Ruby community must provide, not Microsoft. So far, though, no one has stepped forward to create such a beast, although tools like mkrf from this year’s Google Summer of Code could be used to improve this situation considerably.

This does not mean that Microsoft has nothing that it can do to help here. If Microsoft wishes to win the Ruby community toward using VS8 instead of continuing to use VC6 or stepping sideways toward the MinGW/MSYS combination, it needs to offer positive steps toward increased compatibility even between Microsoft’s tools and offer some way of making it easier to build tools that are primarily developed on UNIX-like platforms for Windows.

Let me be clear: I consider using MinGW/MSYS a step backwards (not sideways) for the Ruby community to take, because it goes to a known-buggy runtime (VC6’s MSVCRT.DLL) and will have a harder time interfacing with Microsoft’s modern technologies moving forward. Not everyone considers this a problem; they need Windows versions of tools that they have been using on UNIX-like environments more than they need the advanced capabilities offered by the later runtimes. This is a compelling argument, but I do not find it wholly convincing.

There is a vocal contingent of Ruby developers that would like to be able to have a version of the Ruby One-Click Installer that includes a compiler as part of its distribution, possibly as a separate download. This is something that would matter significantly for people who use Ryan Davis’s RubyInline or ZenOptimize, and other development environments. Again, I don’t find this compelling (and would prefer not having such a thing as a single distribution package because it would be unnecessarily large). It is a significant concept, though, because the one thing that the Microsoft tool chain has over MinGW/MSYS is that it is easier to install and get running, even though it’s significantly harder to build extensions with.

What Do We Need From Microsoft?

This question is the source of considerable debate and anger. There is a level of mistrust of Microsoft in the Ruby community (especially the Japanese community) based on missteps by the Microsoft developer community with regard to backwards compatibility. For this, I’m going to quote Usa Nakamura, who builds the Ruby binaries that Curt uses for the Ruby One-Click Installer for 1.8.4 and 1.8.5, with some clarification by myself and URABE Shyouhei. The original messages were posted in response to a post that I made on the 26th of June, 2006, and I can provide them if they are needed.

The problem of errno is not serious, I think. We will be able to avoid the problem with some simple code (for example, replace it with a function call by macro.) The real problems [arise from separate resource management per runtime DLL version, keeping independent file descriptor tables, memory management, etc., that are internally maintained and not open through any API].

The decision [by Microsoft to break] binary compatibility between versions of runtime […] is foolish. [Did they think] that passing file descriptors between DLLs [would not happen]? As time goes by, there will be some [need] to introduce such incompatibility, I know. […] VC7, VC7.1, and VC8 were shipped [in rapid succession], and they are mutually incompatible[…]. It’s crazy, to say the least.

I decided to [stay] with VC6 for the above-mentioned reasons. If MS keeps shipping incompatible versions at each upgrade, I will throw away VC and shift to MinGW. If MinGW also follows [MS’s binary incompatibility], I’ll shift to Cygwin or throw away Windows as development environment.

[I] hope [that] Microsoft [could provide a] wrapper DLL [for] MSVCRT80.DLL named MSVCRT.DLL. If MS prepares such a mechanism, [there will be] binary compatibility[…]. If so, [it wouldn’t matter whether a program links against MSVCRT80.DLL or MSVCRT.DLL].

It’s bad enough where we have the source but prefer using pre-compiled binaries either because of time, trustworthiness, or compilation complexity. Ara Howard notes that compiling the GNU Scientific Library (GSL) on Windows is so complex without Cygwin or MinGW that there is a company that charges $600 for a compiled version of the source. Others have noted similar issues with other code.

It’s worse when we have third-party binary DLLs compiled with an earlier or later version of Microsoft tools. Ruby is receiving a lot of attention now because of Rails, and I suspect that Microsoft would love to have people using SQL Server as their database underneath Rails, but if there are problems compiling the binary extensions to link against the SQL Server interface drivers, this is less likely to happen or be common. Oracle drivers present a similar problem. We don’t have the source to recompile with the tool that we prefer, and the binary DLL we must use may have been compiled with an earlier incompatible version of the compiler.

(Let me digress for a moment and make it clear that this is an overall problem on all platforms, not just Windows; I have experienced it most often on Linux in my day job where I do C/C++ development. It is far more apparent on Windows, however, because it is unusual for there to be more than one C/C++ runtime on a UNIX-like system. Not impossible, but unusual.)

Thus, the first thing that we feel would be beneficial from Microsoft is some sort of runtime shim or wrapper that would allow us to use programs built with VC6, VC7, VC7.1, VC8 or even MinGW compatibly.

The second thing that we feel would be beneficial from Microsoft is a better command-line tool chain that is preferably compatible with many of the UNIX-style build commands. A minimal start for this would be command-wrappers that allow you to use a gcc/g++ front-end that actually calls VS8’s cl.exe with the appropriate command-line parameters. Microsoft’s move away from command-line support and toward devenv as the primary build environment helps most Windows developers, but hurts those who do their primary development on platforms other than Windows.
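To make the wrapper idea concrete, here is a minimal sketch of the argument translation such a front-end would have to do. The module name GccToCl and its translate method are hypothetical, and the mapping covers only a handful of flags (compile-only, optimization, debug info, include paths, macro definitions, and output names); a real wrapper would need far more.

```ruby
# Hypothetical sketch: translate a few gcc-style arguments into their
# cl.exe equivalents. The mapping is deliberately partial.
module GccToCl
  # Flags that map one-to-one.
  SIMPLE = {
    "-c"  => "/c",   # compile only, do not link
    "-O2" => "/O2",  # optimize for speed
    "-g"  => "/Zi",  # emit debugging information
  }.freeze

  module_function

  # Returns the cl.exe argument list for a gcc-style argument list.
  # Unrecognized arguments (such as source file names) pass through.
  def translate(args)
    args = args.dup
    out = []
    until args.empty?
      arg = args.shift
      case arg
      when "-o"
        target = args.shift
        # cl.exe names object files with /Fo and executables with /Fe.
        if target.end_with?(".o", ".obj")
          out << "/Fo#{target.sub(/\.o\z/, '.obj')}"
        else
          out << "/Fe#{target}"
        end
      when /\A-I(.*)\z/
        # Accept both "-Idir" and "-I dir".
        out << "/I#{$1.empty? ? args.shift : $1}"
      when /\A-D(.*)\z/
        # Accept both "-DNAME" and "-D NAME".
        out << "/D#{$1.empty? ? args.shift : $1}"
      else
        out << SIMPLE.fetch(arg, arg)
      end
    end
    out
  end
end

cl_args = GccToCl.translate(%w[-c -O2 -Iinclude -DNDEBUG -o ext.o ext.c])
```

A wrapper script named gcc could pass the translated list to cl.exe, so that makefiles written for UNIX-like systems would not need to know which compiler is really underneath.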

The third thing reaches into core functionality problems that Nobu Nakada posted about in late July:

Charlie Savage had some interesting things to say in a lengthy discussion in July while I was in Europe with limited Internet access:

From my experience using both tool chains on Windows (for the ruby-prof extension and SWIG-based extensions for GEOS and GDAL).

  • You can build Ruby extensions using MingW that run against Ruby built with VC++. I’ve done this with Ruby 1.8.2/1.8.4, various MingW releases and VC++ 2003 and VC++ 2005. This used to require changing a small bug in ruby.h for Ruby 1.8.2, but that bug has been fixed with 1.8.4.[…]
  • However, you cannot do this with MingW using VC++ built Ruby.

    ruby extconf.rb
    make
    make install

    The problem is that extconf is quite limited – it will assume you are building your extension with the same compiler that built Ruby (VC++). Python avoids this issue because distutils will recognize the compiler being used (MingW, VC++) for the extension and provide the correct command line parameters.

  • If mkrf can work like Python distutils, then it will become simple to use MingW to build extensions that work with VC++
  • When compiling with MingW do not link against the ruby *.lib files. Instead, just link directly against the DLL (msvcrt-ruby18.dll). It’s faster (links much faster) and works better.
  • So you need to manually compile your extension or create a makefile to do it. This actually turns out to be the way GEOS and GDAL work – they have autoconf based build systems so extconf.rb wouldn’t fit in anyway.
  • The advantage of MingW is that it avoids the unmanaged assemblies that VC++8 uses, so it’s simpler to deal with[…]
  • VC++ has several large advantages on Windows.
    • First, it lets you debug your extensions while GDB does not support this on Windows (or if it does, it’s never worked for me).
    • Second, it compiles much faster
    • Third, there is a lot more help available.
    • [Fourth], it’s quite easy to build Ruby extensions.
  • Using MingW on Windows is a huge barrier to entry. Getting MingW set up, along with msys, is a time-consuming process that only experienced *Nix developers will understand and be able to do.
  • MingW on Windows is not very easy to use. It’s nice to think that you can download an open source project, type ./configure, make, make install and it will work. Alas, it doesn’t really work that way. There are a myriad of issues you run into. First you’ll need msys. Then many projects have prerequisites that you’ll have to download and compile. In addition, you often times have to change the CFLAGS and LDFLAGS to get successful compiles. Linking is a pain and requires hand-holding, and sometimes just doesn’t work. Libtool is really flakey on Windows. For some projects, you’ll need to download/build/install the latest version of it. You also need to get autoconf/automake installed. Many projects require bison – something I’ve never been able to successfully compile on Windows. All in all – it literally took me weeks to figure out how to get everything to work together. The MingW/msys tool chain is quite complex on Windows, and most people won’t have the time or desire to put forth the effort to get it to work.

My recommendation:

  • Use VC++ 2005 and get Microsoft to tell us how to properly use unmanaged assemblies so that we can avoid dll hell
  • Make sure that mkrf supports building Ruby extensions “out-of-the-box” on Windows using MingW if you have it installed. I think this would be the best of both worlds – you support both tool chains. VC++ is the default one, but MingW should work fine for building extensions.

Hope this helps – I’d be glad to share more of my experiences if it’s helpful.
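Charlie’s point about extconf and distutils comes down to picking build flags from the compiler actually in use rather than the one that built Ruby. The following is a rough sketch of that idea only; compiler_family and extension_build_command are hypothetical helpers, not part of mkmf or mkrf, and a real build would also need include paths and the Ruby import library.

```ruby
require "rbconfig"

# Hypothetical helper: classify a compiler command string into a family.
def compiler_family(cc)
  case File.basename(cc.to_s.split.first.to_s).downcase
  when /\Acl(\.exe)?\z/ then :msvc
  when /gcc|mingw/      then :gcc
  else                       :unknown
  end
end

# The flag each family uses to produce a shared library.
SHARED_LIBRARY_FLAGS = {
  msvc: ["/LD"],      # cl.exe: build a DLL
  gcc:  ["-shared"],  # gcc (including MinGW): build a shared library
}.freeze

# Builds an argument list for compiling one source file into a shared
# library. The compiler defaults to the CC environment variable and only
# then falls back to the compiler that built Ruby.
def extension_build_command(source, output, cc: ENV.fetch("CC", RbConfig::CONFIG["CC"]))
  family = compiler_family(cc)
  flags = SHARED_LIBRARY_FLAGS.fetch(family) do
    raise ArgumentError, "unsupported compiler: #{cc.inspect}"
  end
  output_flag = family == :msvc ? ["/Fe#{output}"] : ["-o", output]
  [cc.split.first, *flags, source, *output_flag]
end

mingw_command = extension_build_command("ext.c", "ext.so", cc: "gcc")
msvc_command  = extension_build_command("ext.c", "ext.dll", cc: "cl -nologo")
```

The design choice that matters is the default for cc: the compiler the user asks for wins over the one recorded in RbConfig, which is what would let a MinGW user build against a VC++-built Ruby without hand-written makefiles.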

So, what can we do to move Ruby toward a highly usable environment that is based on modern Microsoft compiler technology yet remains backwards compatible with the tools that for many reasons we cannot give up?

A Brief Note

Posted by austin in: HaloStatue, comments closed

Some folks complained that they were unable to comment on my blog yesterday. I have purposely disabled anonymous comments at all times on my blog. This means that I will get fewer comments, perhaps, but I don’t even want to start fighting comment spam if at all possible. As such, I am requiring user registration at least until I can figure out how to integrate OpenID or some other federated identifier with my blog.

Ola Bini’s Ducktator and Controversial Topics

Posted by austin in: Ruby, comments closed

UPDATE September 21: Yesterday, I got a little annoyed at Ola Bini and something he said. He was surprised by the reaction he got to his use of a term which is very common in Ruby parlance, but has overloaded meanings. At some point in the mailing-list discussion, he stepped away and made what some (myself included) felt was a sideswiping response at the participants of the on-list discussion. We have resolved the conflict and clarified our positions toward each other, and rather than pretending that the whole episode didn’t happen, he and I have agreed to edit our respective blog posts so that they are far more explanatory and less inflammatory overall. Ola certainly didn’t mean any harm by his post, and I believe him. As such, the title of this post has changed. I’m sure that for a while at least, you’ll be able to find the originals in the Google Cache, or maybe the Wayback machine, but understand this: what is below and on Ola’s updated post reflects how things are.

After releasing Ducktator, Ola Bini was surprised at the responses that were received toward his use of the term “duck typing.” The discussion went on for a while, and Ola eventually decided to make a post to his blog about it. I think that the library is cutely named (like others on the ruby-talk mailing list, I like puns) and several people have suggested that it’s going to be very useful for them. However, I don’t think that Ducktator is about duck typing, and its name is unfortunate in that sense, because it has the chance of confusing the discussion surrounding duck typing even further.

What is Duck Typing?

The simplistic answer to this, of course, is “if it walks like a duck, and it quacks like a duck, then it must be a duck.” This is certainly true, but the real question is more “how do I know if I’m duck typing?” Many people, myself included, consider duck typing a matter of trusting one’s callers and documenting your API well enough to ensure that the callers can trust that you won’t do something they’re not expecting. The canonical example of this is a logger.

class SimpleLogger
  def initialize(recipient = nil)
    @out = recipient || $stderr
  end

  def log(message)
    @out << "#{message}\n"
  end

  attr_reader :out
end

l = SimpleLogger.new([])
l.log "Hello"

SimpleLogger simply trusts its users to give it recipient objects that respond to #<<. Now, this is a simple case, but what about something a little more complex? Well, Text::Format can accept a hyphenator object, which must implement a particular method that has a particular arity. So when you assign a hyphenator object, Text::Format does object signature validation for both the presence of the #hyphenate method and its arity (two, I believe). I consider Text::Format’s approach less duck-typed (possibly even not duck-typed, although it is more dynamic than class-based validation) than I consider the SimpleLogger class above. Both are useful techniques, but Text::Format is less flexible than SimpleLogger.
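The difference between the two approaches can be sketched side by side. The ValidatingFormatter and NullHyphenator classes below are hypothetical stand-ins written for illustration; they are not Text::Format’s actual code, and the arity of two is taken from the description above.

```ruby
require "stringio"

# Duck-typed: the logger sends #<< and trusts the recipient to respond.
class SimpleLogger
  attr_reader :out

  def initialize(recipient = nil)
    @out = recipient || $stderr
  end

  def log(message)
    @out << "#{message}\n"
  end
end

# Object signature validation: check for the method and its arity
# before accepting the object.
class ValidatingFormatter
  attr_reader :hyphenator

  def hyphenator=(object)
    unless object.respond_to?(:hyphenate) && object.method(:hyphenate).arity == 2
      raise ArgumentError, "hyphenator must respond to #hyphenate with two arguments"
    end
    @hyphenator = object
  end
end

# A hypothetical hyphenator with the expected two-argument signature.
class NullHyphenator
  def hyphenate(word, size)
    [word, ""]
  end
end

# Any recipient that responds to #<< works, with no check up front.
array_log  = []
string_log = String.new
io_log     = StringIO.new
[array_log, string_log, io_log].each { |r| SimpleLogger.new(r).log("Hello") }

# The formatter rejects objects that lack the required signature.
formatter = ValidatingFormatter.new
formatter.hyphenator = NullHyphenator.new
```

With SimpleLogger a bad recipient fails only when #log is called; with the validating setter the failure moves to assignment time, at the cost of rejecting objects that would have worked through method_missing or a different arity.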

That’s OK, though. The increased complexity of the API for hyphenation suggests that validation is not a bad thing. When you get into the realms of problems that Ola wrote Ducktator for, you need even more complex — and arguably less flexible — validation. As I said yesterday:

As Eric Mahurin noted in a spin-off post from the main thread about Ducktator, there’s a number of possible definitions for duck typing, so it’s understandable that people find themselves confused. I’m not Dave Thomas (he and Andy applied the term first to Ruby, as far as I can tell), but as far as I’m concerned, object signature validation is not duck typing. It never has been, and it never will be. It’s object signature validation.

I don’t care whether you do object signature validation by the class of the object (in which case you’d usually be unnecessarily restricting yourself in Ruby) or by the actual method or methods you need, you’re still not doing duck typing.

I still believe this to be true. By all means, use Ducktator if it’s going to help you. It isn’t what I consider to be duck-typing, though.

More on Ruby Performance
Saturday, 2 September 2006

Posted by austin in: Ruby, comments closed

After yesterday's post, Vidar of edgeio (a classified advertising site) posted to ruby-talk his experience with using Ruby in a high-performance environment. He previously had a simple messaging middleware that he had written in C++ and replaced it with a 700-line Ruby middleware that had more features and took less time to write. (Granted, some of the less time to write is because the second time you implement something, you know what you’re doing and know what shortcuts you can take.) The interesting bits from his article, though, suggest he’s done some serious profiling:

We hardly ever max out [CPUs]. … The messaging middleware app handles millions of messages per day and rarely takes up 10% of a single CPU on the servers it runs on.

Most of the CPU time used by our Ruby apps is spent waiting for IO. … about 80% of the time [for the middleware] is spent in the kernel in read() or write() syscalls.…

Ruby doesn’t have to be the absolute fastest at everything. Until you’ve tried it, though, you won’t know whether it’s fast enough for your problem. It might be, or it might not be. I’ve got some code that I’ve been hacking at work for our build process. I should have just written the programs involved (a collection of shell scripts) in Ruby. It would have taken me less time, left me something more manageable overall, and been more capable in the future. The downside? I would have had to make sure that Ruby is installed on all of my machines, which isn’t necessarily easy (I have some AIX boxes in the mix). Unfortunately, by the time I really reached that conclusion, I had already spent enough time to make it non-economical to switch for at least another year (the programs don’t need to be maintained that often). I guarantee you, though, that the Ruby code would have performed just fine and it would have been easier for people to add features to than the shell scripts that I have.

Joel on Ruby
Friday, 1 September 2006

Posted by austin in: Ruby, comments closed

Joel Spolsky wrote a little piece today on web programming. In the main, I think he’s right. But he makes some statements about Ruby that are dead wrong. He says:

I for one am scared of Ruby because (1) it displays a stunning antipathy towards Unicode and (2) it’s known to be slow, so if you become The Next MySpace, you’ll be buying 5 times as many boxes as the .NET guy down the hall.

They’re wrong enough that the creator of Ruby, Matz, has said something about it on the ruby-talk mailing list:

[W]e disagree in the middle.[…]

(1) Although we took different path to handle m17n issue from other Unicode centric languages, we don’t have any “stunning antipathy”.

(2) Although Ruby runs slower than other languages in those micro-benchmarks, real bottleneck in the most application lies in either network connection or databases. So we don’t have to buy 5 times as many boxes.

What can be said about Ruby and Unicode right now is that it shows apathy toward Unicode. It neither helps you nor hinders you; it leaves you in about the same place as if you were programming C or C++ without using ICU or something similar. The future is something different. I’ve done enough Unicode and m17n work to know that while Unicode is a great choice for new data, it doesn’t help you much with your legacy data—and all the languages which have chosen to be Unicode to the core have trouble with legacy-encoded data to some degree. Ruby’s m17n support is being planned to be legacy-encoding aware. That way, whatever encoding you have to deal with, it should properly be handled in Strings.
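For readers on a Ruby whose Strings carry their own encoding (1.9 and later), the legacy-aware approach looks roughly like the sketch below. The sample bytes are made up for illustration; the point is that the data keeps its original encoding until the program chooses to transcode it.

```ruby
# Bytes as they might arrive from a legacy ISO-8859-1 (Latin-1) source:
# "caf" followed by 0xE9, which is "é" in Latin-1.
raw = "caf\xE9".b

# Tag the bytes with the encoding they are actually in. No bytes change.
latin1 = raw.force_encoding(Encoding::ISO_8859_1)

# Transcode to UTF-8 only when the rest of the program needs it.
utf8 = latin1.encode(Encoding::UTF_8)
```

The legacy string stays four bytes long and valid in its own encoding, while the UTF-8 copy uses two bytes for the accented character.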

Further, Ruby may be “Known To Be Slow”, but what people Know is bunk. There are long discussions about this on ruby-talk. There are clear examples where changing one’s algorithm results in a dramatic performance improvement, to the point where people don’t care that the program is written in Ruby. Can Ruby’s performance improve? You bet it can. But anyone who tries to tell you that Ruby is Known To Be Slow is simply repeating outdated and inaccurate information and is not speaking from actual experience. Whether Ruby is too slow depends on what you are trying to do.
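As a small illustration of the algorithm point (my own example, not one taken from the ruby-talk threads), here are two ways to find the duplicated values in a list. Both method names are hypothetical. The first rescans an array for every element; the second uses hash-based Set lookups.

```ruby
require "set"

# Quadratic: Array#include? rescans the seen list for every element.
def duplicates_slow(items)
  seen = []
  dups = []
  items.each do |item|
    if seen.include?(item)
      dups << item unless dups.include?(item)
    else
      seen << item
    end
  end
  dups
end

# Roughly linear: Set membership checks are hash lookups.
# Set#add? returns nil when the item was already present.
def duplicates_fast(items)
  seen = Set.new
  dups = Set.new
  items.each { |item| dups << item unless seen.add?(item) }
  dups.to_a
end

sample = [1, 2, 1, 3, 2, 1]
slow_result = duplicates_slow(sample)
fast_result = duplicates_fast(sample)
```

Both return the same answer; on inputs of many thousands of elements the second version does far fewer comparisons, which is the kind of change that matters more than the language the program is written in.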

That said, I don’t use Ruby exclusively. I’m not paid to. Even if I were able to use it as an option, I know that Ruby is too slow for the specific domain problems that I have to solve at work. But for other problems (including some picture sorting that I’ve had to do at home and have written an article about), Ruby is as fast as or faster than other tools.