We use a non-recursive GNU Make system in the company I work for. It's based on Miller's paper and especially the "Implementing non-recursive make" link you gave. We've managed to refine Bergen's code into a system where there's no boiler plate at all in subdirectory makefiles. By and large, it works fine, and is much better than our previous system (a recursive thing done with GNU Automake).
We support all the "major" operating systems out there (commercially): AIX, HP-UX, Linux, OS X, Solaris, Windows, even the AS/400 mainframe. We compile the same code for all of these systems, with the platform dependent parts isolated into libraries.
There's more than two million lines of C code in our tree in about 2000 subdirectories and 20000 files. We seriously considered using SCons, but just couldn't make it work fast enough. On the slower systems, Python would use a couple of dozen seconds just parsing in the SCons files where GNU Make did the same thing in about one second. This was about three years ago, so things may have changed since then. Note that we usually keep the source code on an NFS/CIFS share and build the same code on multiple platforms. This means it's even slower for the build tool to scan the source tree for changes.
Our non-recursive GNU Make system is not without problems. Here are some of biggest hurdles you can expect to run into:
The major wins over our old recursive makefile system are:
Regarding the last item in the above list. We ended up implementing a sort of macro expansion facility within the build system. Subdirectory makefiles list programs, subdirectories, libraries, and other common things in variables like PROGRAMS, SUBDIRS, LIBS. Then each of these are expanded into "real" GNU Make rules. This allows us to avoid much of the namespace problems. For example, in our system it's fine to have multiple source files with the same name, no problem there.
In any case, this ended up being a lot of work. If you can get SCons or similar working for your code, I'd advice you look at that first.