cLabs Blogki


//ComputersAndTechnology/SurgeonAnalogy - Tue Oct 07 2008 05:01 AM GMT
see DevelopmentAnalogies

Dave Hoover writes about a talk he gave on washing your hands:
This is something Uncle Bob started talking about 2 years ago: "It has become my position that TDD is a necessary discipline for professional developers. I consider it rather like sterile procedure for doctors. It's simply what you have to do to write professional code." ... I find this analogy particularly appropriate when you have an existing codebase and you've been asked to fix or enhance something. Like a surgeon you need to ensure you do not infect your patient, and in order to prevent that, you need to create a sterile environment. If you don't have tests in place, you retrofit them in order to prevent regressions (similar to cleaning a dirty patient).

He goes on to reference the "First, do no harm" principle amongst other observations.
 
 
//ComputersAndTechnology/MakingBadProgrammersCare - Sat Sep 06 2008 10:14 PM GMT

BruceEckel blogged this:
On the other hand, the WBC summit was really about psychology: why do programmers write poor code and don't seem to care about it, and what can we do to convince them to write better code and to care? At best, we succeeded in enumerating the problems that we had seen, so compared to the other summits we reached no conclusions. But perhaps the struggle was the important thing, and like Weinberg's PSL company simulation, we all needed to have these ideas inserted so we could struggle with them over the ensuing years.


I had a similar conversation with PragDave once. He related an incident with a programmer at a client who seemed 100% on board with unit testing, and then never actually did it. They'd go round and round, discussing doing it, and then it never happened.


Rick Kitts would like a "developer psychology book" (I'm quoting a comment he posted to his own article, not directly linked here...). Sounds like he wants one for the business people, too.




Luke Hohmann says:
Forcing a given team to adopt an approach that they don't believe in ... is a certain recipe for failure.
How do we change their beliefs?

Alistair Cockburn throws a log on the fire of despair:
Somewhere in here is the point I'm trying to make.
* One point is that there really are a bunch of good techniques and behaviors out there, and there are people who use them effectively.
* One point is that most people don't know of such techniques and can't use them.
* One point is that Oh So Very Many People don't have the motivation and energy either to research and learn the techniques, or to apply them. With the very many disastrous results that we see.
* One final point is that the last point limits our hopes of eventual success, whether with agile, with lean, with project management, product management, testing, architecture, etc etc etc. As long as the multitudes can't be bothered, it doesn't matter what great techniques some of us know.
So, in this sense, I'm done with the "60% projects fail, 60% features don't get used" attack (or defense).

Those failures can be ascribed to lack of caring. That has no known antidote.

As Bruce says, this is about psychology and ethics and morals. This is about the spirit and the soul.



see FaithAndBusiness.
 
 
//ComputersAndTechnology/TechnicalDebt - Thu Aug 28 2008 06:05 PM GMT
There was a technical debt workshop recently, and Michael Feathers and Brian Marick were encouraging switching the thought process from debt to assets, from a negative to a positive.

Off the cuff, I don't like this shift. I like the negativity of debt because I presume customers already see code written as an asset, and the struggle is in getting them to understand how code written that makes their UI work can still have maintainability problems. "You have debt, because this code is coupled and has no test coverage" to me is a more direct way of indicating problems, even though the app does its job on the surface. This resonates with me about how debt works in my life: sure, I've got a new shiny computer that's doing its job, but what's not reflected in the shine of my gadget is the negative balance on my credit card.

"Your assets have some deficiencies" (or whatever) I think leaves too much additional mental wiggle room: "well, that's fine - we just need it to work, we don't need it to be in tip-top shape."

I suppose Feathers and Marick have had different experiences with customers who shrug off debt and would be more impacted by discussions of assets with problems.
 
 
//ComputersAndTechnology/CodeIsTheDesign - Wed Aug 20 2008 03:10 AM GMT
"What is Software Design" is a classic article, so I'm not sharing anything new here, but I realized I didn't have it solidly stored in my blogki, so now 'tis. If you're not familiar with the article, I suggest giving it a read. It's a long one; in fact, I don't think I've ever read the whole thing. Really, I just like this bit:
There is one consequence of considering code as software design that completely overwhelms all others. It is so important and so obvious that it is a total blind spot for most software organizations. This is the fact that software is cheap to build. It does not qualify as inexpensive; it is so cheap it is almost free. If source code is a software design, then actually building software is done by compilers and linkers.
If you want to get into this some more, one place to start is at c2's page (the original wiki).
 
 
//ComputersAndTechnology/TheCaseOfTheInactiveMailerTest - Tue Aug 05 2008 06:16 AM GMT
I couldn't get actionmailer-2.1.0/test/mail_server_test.rb to run. I was just picking on it for research related to my DiyMocks article, and couldn't get it to work.

The first problem turned out to be a bug in the 0.9.0 version of mocha, which was simple enough to work around; the author then sent me a patch.

But the test ran into a new problem after the mocha patch:

undefined method 'class_inheritable_accessor' for ActionMailer::Base:Class (NoMethodError)


I went and got git for Windows (many thx to those keeping us Windoze loozers on the fringe of the action) and yanked the source down. On master, the test passed. Changed to the release branch ("git checkout origin/2-1-stable") and the test also passed. Hmmm. Perhaps we had a deployment problem? Or maybe the git branch has been patched but not released?

I ran a WinMerge against the source folder and the deployed gem folder, and while I found some minor differences, nothing important showed up.

As Raymond Chen says, "Theorize if you want, but if the problem is right there in front of you, why not go for the facts?"

First off, if class_inheritable_accessor couldn't be found, where did it live? A Google Code search[1] turned up several hits. Clicking through the first hit led me to a Rails diff that showed class_inheritable_accessor was defined in /tags/rel_1-0-0/activesupport/lib/active_support/class_inheritable_attributes.rb, at least back in 1.0.0. Looking in the gem deployment folders for 2.1.0 showed no such file.

Browsing through more Google Code results showed activesupport-1.3.0 had this method defined in lib/active_support/core_ext/class/inheritable_attributes.rb. Lo and behold, there it was on my hard drive in gems/activesupport-2.1.0/lib/active_support/core_ext/class/inheritable_attributes.rb.

Now that I knew where it lived, I needed to find the disconnect between it and mail_server_test.rb. The stack trace was no help, because the error occurred only when the class under test couldn't find what it needed, not when the file wasn't loaded -- it's a non-event.

My next thought was to simply add a require statement in mail_server_test.rb to force the inclusion of inheritable_attributes, but it was obvious from the structure of the test files and their dependency on abstract_unit that the designers intended for abstract_unit to be a hub of requires, not mail_server_test.rb. And besides, on the source 2.1.0 branch, the test worked without that direct require; something else was amiss.

At this point, I could sidebar on Things Java Has Right That Ruby Doesn't. In Java I couldn't get away (barring use of reflection) with a reference to a method that's not been imported, and I would have to import that class directly in my test class (presuming here, that probably in the Java version of all this, we wouldn't be sharing package names). Viva la static typing, no? Fear not, Ruby will get its chance.[2]

Trouble is, because Ruby doesn't have the strict importing rules Java does, who knows where the missing link in the requires dependency tree is?

Well, come to think of it, the code in the pulled source knows.

I opened up inheritable_attributes.rb and added this line at the very top of the file,[3]

begin raise Exception.new; rescue Exception => e; puts e.backtrace; end

then ran the test from the pulled source. The stack trace showed me the path from mail_server_test.rb to inheritable_attributes.rb, so I switched back over to my local gems folder and started walking the path looking for anything out of the ordinary. In actionmailer-2.1.0/lib/action_mailer.rb, I found what I was looking for:

unless defined?(ActionController)
  begin
    $:.unshift "#{File.dirname(__FILE__)}/../../actionpack/lib"
    require 'action_controller'
  rescue LoadError
    require 'rubygems'
    gem 'actionpack', '>= 1.12.5'
  end
end

Ahh, someone perhaps was a little too clever in working out how to require action_controller in both the source environment and deployment environment. Unfortunately, they overlooked one line, probably a simple mistake not caught by verifying the unit tests in a deployed environment:

--- a/actionmailer/lib/action_mailer.rb
+++ b/actionmailer/lib/action_mailer.rb
@@ -28,6 +28,7 @@ unless defined?(ActionController)
   rescue LoadError
     require 'rubygems'
     gem 'actionpack', '>= 1.12.5'
+    require 'action_controller'
   end
 end

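Activating a gem with the gem method only makes that gem's files findable on the load path; it does not require any of them. At least, not by default. So in the deployed case, the rescue branch activated actionpack but never actually loaded action_controller.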
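To make the activate-versus-require distinction concrete, here is a minimal sketch. It does not touch real gems: it assumes that putting a directory on $LOAD_PATH is a fair stand-in for what gem activation does, and FakeActionController and its file name are made up for the illustration.

```ruby
require 'tmpdir'

# Returns [state_after_activation, state_after_require] for a fake library.
# FakeActionController is a hypothetical stand-in, not a real Rails constant.
def load_states
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'fake_action_controller.rb'),
               "module FakeActionController; end\n")

    # Roughly the effect of `gem 'actionpack'`: the files become findable.
    $LOAD_PATH.unshift(dir)
    after_activation = defined?(FakeActionController) # nothing loaded yet

    # Only an explicit require actually evaluates the file.
    require 'fake_action_controller'
    after_require = defined?(FakeActionController)

    $LOAD_PATH.delete(dir)
    [after_activation, after_require]
  end
end

LOAD_STATES = load_states
p LOAD_STATES # prints [nil, "constant"]
```

The constant only shows up after the require, which is the same gap the one-line patch closes.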

Now the test in question runs in its deployment folder, and I can get on with my anal retentive life.



Notes:

[1] Why not grep the local gems folder or somesuch instead of Google Code search? I dunno - no particular reason. Probably because Windows' find command is not under my fingertips (and even if it was, it doesn't recurse?!?) and I was too lazy to use the grep-ish tools in my editors.

[2] Personally (probably due to laziness rather than wisdom) this is a trade-off thing. I love the static analysis I can get in Java; I love the cool stack trace trick I used in Ruby to help me find the problem.

[3] Sure enough, I publish this, and then think - wait wait wait - I don't have to throw an exception to get a trace ... right? There's another way on the tip of my brain - but ... too tired to go research and remind myself. Must sleep now.
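For what it's worth, the other way footnote [3] is reaching for is probably Kernel#caller, which returns the current call stack as an array of strings without raising anything. A small sketch, with made-up method names:

```ruby
# Returns the call stack at this point, innermost caller first.
def where_am_i
  caller
end

# A hypothetical intermediate frame, just so the trace has something in it.
def outer
  where_am_i
end

TRACE = outer
puts TRACE
```

Dropped at the top of a file as a bare "puts caller", this should print the chain of requires that led to the file being loaded, much like the begin/raise/rescue one-liner did.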
 
 
//ComputersAndTechnology/AgileDevelopment/TheKeyToSuccess - Wed Jul 30 2008 06:40 PM GMT
[New stuff (usually) at the bottom of the page, if you've read all this before...]


One thing that seems to stick in my mind the more I read about keys to successful teams is simply getting the right people.

JoelSpolsky recommends Facts and Fallacies of Software Engineering, by Robert L. Glass as a good summary of the core things we know about software development so far (which is interesting in light of SwebokAndLicensing, and I've heard some other heavyweights in the development world are not in love with the Glass book). Number one in Joel's summary of the facts presented in the book:
The most important factor in software work is not the tools and techniques used by the programmers, but rather the quality of the programmers themselves.

Joel also has a great essay on how attempts to codify the successfulness of talented people just don't work.
1. Some things need talent to do really well.
2. It's hard to scale talent.
3. One way people try to scale talent is by having the talent create rules for the untalented to follow.
4. The quality of the resulting product is very low.



I was listening to a local sports station the other day interview John Gagliardi, who, on 11/8/03, became college football's winningest coach. A lot of the interview centered around John's coaching style, and the fact that he does a lot of things differently (no tackling during practice, for example). The overall gist was that he didn't have a very detailed infrastructure for his team -- much of what a big college football program would do, he doesn't bother with. In fact, he summarized his approach this way (and I paraphrase because, being in the car at the time, I didn't write down the quote):
What you really need is good players. Good players don't need a lot of rules.

(Or structure: "On the best teams, different individuals provide occasional leadership, taking charge in areas where they have particular strengths. No one is the permanent leader... The structure of the team is a network, not a hierarchy." Peopleware: Productive Projects and Teams, 2nd ed., p. 155, Tom DeMarco and Timothy Lister -- via Dave Hoover's blog)



1/12/04 - Bill Caputo posts this [quote is digestized]:
[Non-agilists] see nothing -- or very little -- in the XP process definition that will make people successfully deliver -- but its true of any process. [They] want a process that will make others succeed. Agilists believe this view of process is inherently flawed. No process makes you successful, but people who will succeed anyway can do so with a more or less painful process.

XP is simply the best collection of practices I have ever found that are generally useful in addressing the problems I try to solve on each project -- but in the end, I succeed because of me and the people around me, not our process, which we readily change in response to our current challenges.

[W]e don't claim XP will protect us from the harmful, we claim that it aids the successful.



1/22/04 - Jerry Weinberg, in an interview on Borland's site (via Esther Derby), chimes in with some soundbites on this topic:
What do you consider the most important thing for a programmer to do when he begins working on a new project?

I think each should be sure they are in good physical condition without nagging psychological problems.

What do you consider the most important thing for a programmer to do when he begins working on a project that has already begun?

She should get sufficient information to decide, before signing on, whether she should sign on. Most programming projects that fail have already failed before most of the programmers have signed on, but through lack of courage or due diligence, many programmers sign on anyway. It's like doctors agreeing to do surgery on corpses.

Would you recommend a career in programming to young people today?

It depends on what the young person wants to do. I always give the same career recommendation: "Do what you want to do."

What courses would you recommend they take? What languages/technologies should they key on?

They shouldn't key on languages and technologies. They should key on learning to communicate, to think, and to work well with other people. Once they have those, the languages and technologies become simple matters. Without them, no amount of language or technology expertise will do much good.



5/20/04 - Fast Company article on how good companies become great. The secret sauce? People. [via Clarke Ching]
Take David Maxwell's bus ride. When he became CEO of Fannie Mae in 1981, the company was losing $1 million every business day, with $56 billion worth of mortgage loans under water.

Maxwell told his management team that there would only be seats on the bus for A-level people who were willing to put out A-plus effort. He interviewed every member of the team. He told them all the same thing: It was going to be a tough ride, a very demanding trip. If they didn't want to go, fine; just say so. Now's the time to get off the bus, he said. No questions asked, no recriminations. In all, 14 of 26 executives got off the bus. They were replaced by some of the best, smartest, and hardest-working executives in the world of finance.

With the right people on the bus, in the right seats, Maxwell then turned his full attention to the "what" question. He and his team took Fannie Mae from losing $1 million a day at the start of his tenure to earning $4 million a day at the end.



11/11/04
[O]nly a virtuous people are capable of freedom. As nations become corrupt and vicious, they have more need of masters.

Source: Benjamin Franklin, The Writings of Benjamin Franklin, Jared Sparks, editor (Boston: Tappan, Whittemore and Mason, 1840), Vol. X, p. 297, April 17, 1787.

found at http://www.wallbuilders.com/resources/search/detail.php?ResourceID=21



3/25/05
MartinFowler has chimed in on this topic on his own blog:
If I had to pick one as my key to software development it's that the critical element in a software development effort are the people you have doing the work. The productivity of the best developers is far more than the average, much more than the difference in salaries.
He added another article in Feb 08 talking more about the productivity factor:
Although the technorati generally agree that talented programmers are more productive than the average, the impossibility of measurement means they cannot come up with an actual figure. So let's invent one for argument sake: 2. If you can find a factor-2 talented programmer for less than twice of the salary of an average programmer - then that programmer ends up being cheaper. To state this more generally: If the cost premium for a more productive developer is less than the higher productivity of that developer, then it's cheaper to hire the more expensive developer. The cheaper talent hypothesis is that the cost premium is indeed less, and thus it's cheaper to hire more productive developers even if they are more expensive.



August 2007
Alistair Cockburn:
People still trump process ... and theory, and ideas.

One of the great things I keep seeing (used to be "keep learning", but at least by now I half expect it when things blow up in my face), is how the individual chemistry between people operates outside of all the nice theory we construct.


Mary Poppendieck (via InfoQ) offers an interesting quote that seems to run counter to the other quotes on this page:
We get brilliant results from average people managing brilliant systems. Our competitors get average results from brilliant people working around broken systems. - Fujio Cho, Chairman Toyota Motors
Elsewhere at Poppendieck's site, this paper gives more insight:
[Scholtes] says, "All of the empowered, motivated, teamed-up, self-directed, incentivized, accountable, reengineered, and reinvented people you can muster cannot compensate for a dysfunctional system...." So where does this leave us? Which is more important - process or people?

...

[The answer is both, "Process AND People"] ... People like to use effective processes, and they also like to have control over their own environment.... Process improvement may be done only "at the gemba" [the place of the problem] and it is up to the workers to decide whether or not a proposed improvement should be implemented.

Oct 2007
Joel weighs in again, this time via Inc.com:
Mistake No. 1: Start with a mediocre team of developers.
Designing software is hard, and unfortunately, a lot of the people who call themselves programmers can't really do it. But even though a bad team of developers tends to be the No. 1 cause of software project failures, you'd never know it from reading official postmortems.

...

At Fog Creek, we tend to review about 400 candidates for every full-time hire, because the best developers can be 10 times as productive as the merely excellent developers.

DaveThomas in his Herding Racehorses, Racing Sheep presentation at QCon London 2007 quotes Capers Jones from his book Software Assessments, Benchmarks, and Best Practices:
"Without excellent personnel, even good to excellent processes can only achieve marginal results."
 
 
//ComputersAndTechnology/ConductorAnalogy - Tue Jul 29 2008 08:40 PM GMT
see DevelopmentAnalogies

From Ted Neward:
At the risk of offering up yet another of those tortured metaphors, let me proffer my own architect analogy: an architect is not like a construction architect, but more like the conductor of a band or symphony. Yes, the band could play without him, but at the end of the day, the band plays better with one guy coordinating the whole thing. The larger the band, the more necessary a conductor becomes. Sometimes the conductor is the same thing as the composer (and perhaps that's the most accurate analogous way to view this), in which case it's his "vision" of how the music in his head should come out in real life, and his job is to lead the performers into contributing towards that vision. Each performer has their own skills, freedom to interpret, and so on, but within the larger vision of the work.

Is it a perfect analogy? Heavens, no. It falls apart, just as every other analogy does, if you stress it too hard. But it captures the essence of art and rigor that I think seeing it as "architecture" along the lines of civil engineering just can't. At least, not easily.
 
 
//ComputersAndTechnology/AccordionPortfolio - Tue Jul 08 2008 03:58 AM GMT
I'm scrubbing up my resume page, moving off the normal boring details to my LinkedIn profile, so I can build up a fancy portfolio with a whole lot more boring details.

For the fancy part, I decided to experiment with some JavaScript. I wanted to do a nested accordion bit, and my initial googling was not too encouraging. I did find some libs that would do nested accordions and started with one I found at Dynamic Drive that works with jQuery. Probably more due to my novice JavaScript skills than the lib, it seemed awkward to work with, so I hunted for another one.

I ran across a scriptaculous lib made by stickmanlabs and started having better success. At the same time I realized that trying to inline the amount of data I had into html myself was becoming a beating, and my code gen nose started twitching. Now I've got a Ruby script with a YAML data section at the end of it that'll spit out all the div markup I need to work with stickmanlabs Accordion v2.0.

So far so good, except I'm pushing the library a bit beyond the demo by having many nested accordions (which might be a smell that my UI approach here for all this data isn't a good one). The library expects only one nested accordion, and for that div to have the id "vertical_nested_container". I actually blew right past that while building things the first time (even though Tidy complained about my having many unique ids that are not actually ... well ... unique), and things worked fine ... in Firefox and Safari at least. But IE7, well, wasn't so kind in accommodating my many redundant unique ids.

My first thought was to try and change the id to a class, and see if that works. It didn't, because of how the code in the accordion.js library is structured for initializing the accordions. Since my JS hacking skills are limited for the time being, I realized the shortest way home at this point was to make separate ids for each nested accordion.

Since I'm already doing code-gen for the div section of the page, changing to unique ids in that section is a breeze. The problem then becomes, what to do with all of the initialization code? Here's the initialization code for one nested accordion:

var nestedVerticalAccordion = new accordion('vertical_nested_container', {
  classNames : {
    toggle : 'vertical_accordion_toggle',
    toggleActive : 'vertical_accordion_toggle_active',
    content : 'vertical_accordion_content'
  }
});

Not bad for one, but I've got 8, and that sort of repetition is not programming. Since I'm already generating the div sections, why not generate the javascript initialization section? A little ugly (and it makes for a longer page with the repetition), but it'll work. I did look a little into writing some init code that would scan the DOM for everything I needed, but I needed a getElementById with a wildcard or somesuch and I couldn't find anything out of the box for that sort of thing.
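Generating that initialization section might look something like the sketch below. It is not the actual script: the numbered ids and variable names are hypothetical, and in the real page they would come from the same data that drives the generated divs.

```ruby
# Emits one accordion initializer per nested container id.
def accordion_init_js(id, n)
  <<~JS
    var nestedVerticalAccordion#{n} = new accordion('#{id}', {
      classNames : {
        toggle : 'vertical_accordion_toggle',
        toggleActive : 'vertical_accordion_toggle_active',
        content : 'vertical_accordion_content'
      }
    });
  JS
end

# Hypothetical ids, one per nested accordion.
NESTED_IDS = (1..8).map { |n| "vertical_nested_container_#{n}" }

INIT_JS = NESTED_IDS.each_with_index
                    .map { |id, i| accordion_init_js(id, i + 1) }
                    .join("\n")
puts INIT_JS
```

The class names are shared by every accordion, so only the container id (and the variable name) changes from one generated block to the next.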

So - wire up the new code-gen, and ... well it doesn't quite work. Well, it works, but it doesn't look right. That's when I remember the css - it has special formatting for the nested accordion based on id. If I'm going to code gen 8 separate nested accordions with 8 separate ids, then I'll now need 8 separate css entries, one for each id. Sigh. Code gen's cool, but this seems to be getting out of hand. Fortunately, after poking around in Firebug for a bit, I realized I could put the styling into the tag itself.

At this point, things are looking good. Nested accordion support without modifying the original library, and taking care of DRY issues through code gen.

Things were working pretty well now, and I needed a color scheme makeover. I Googled around and found kuler, a nice tool for coming up with color schemes. I fed it the background color of my site as a base color, and through various experiments came up with a scheme that I liked.

I started filling up the accordion with content, and as I started accruing a bit of text, I started noticing an annoyance with the presentation. Many times, an expanded accordion of text wouldn't be fully visible without scrolling. This brought to mind a recent 37signals post showing how they added autoscrolling to Backpack, so that when a selected action reveals content below the scroll, the page scrolls the new stuff into view.

Searching for a way to do this with scriptaculous was a bit frustrating at first: none of the effects listed in the wiki docs included the base ScrollTo effect. They recently moved their wiki to github, so maybe that's what happened to it. Fortunately, I kept plugging and eventually ran across other mentions of this effect, which I found in the source. After losing way too much time experimenting with adding this effect inside the accordion.js library, it suddenly clicked and started working -- I hadn't been feeding it the correct element.

I was much happier with the accordion with this effect added in. Makes it pretty smooth to work with. There are still a couple of issues I'd like to tighten up:

- The effect is always triggered, even if there's no need to scroll to the div, so after a while it's a little annoying to see a small page scroll take place to move something which was entirely in view a few lines to keep it still entirely in view.

- Sometimes, especially in the top sections of the overall accordion, the title is scrolled out of view for some reason.

And one last issue that I'd like to correct for multiple nested accordion support:

- Opening a section inside a new nested accordion doesn't close anything in the prior nested accordion.

So ... we'll see where I go from here, be nice to get rid of all the code gen, fix these existing issues and submit some patches to stickman.
 
 
//ComputersAndTechnology/TheCaseOfTheIntermittentBuffer - Mon Jun 23 2008 05:23 AM GMT
At my current gig, we have a custom Java .jar that we also cross-compile to .NET with IKVM. Recently, we ran across a bug that occurred only intermittently with the .NET assembly (and very infrequently - maybe 3 times out of 100). The .jar never had the problem. Here's some example code that demonstrates the issue:

package org.clabs.bufferexample;

public class Processor {

    private IDataProvider data;
    private StringBuffer dataRead;

    public Processor(IDataProvider data) {
        this.data = data;
    }

    public void execute() {
        byte[] buffer = new byte[10];
        StringBuffer stringData = new StringBuffer();
        while (true) {
            int bytesRead = data.read(buffer);
            if (bytesRead == -1) {
                break;
            }
            stringData.append(new String(buffer));
        }
        processXML(stringData.toString());
    }

    private void processXML(String string) {
        // pretend this is real code
        dataRead = new StringBuffer();
        int fromIndex = 0;
        while (true) {
            int index = string.indexOf("foo=", fromIndex);
            if (index > -1) {
                String s = string.substring(index + 4, index + 5);
                if (dataRead.length() > 0) {
                    dataRead.append(",");
                }
                dataRead.append(s);
                fromIndex = index + 4;
            } else {
                break;
            }
        }
    }

    public String getDataRead() {
        return dataRead.toString();
    }
}


Here's a passing test showing the code works fine:

package org.clabs.bufferexample;

import java.util.ArrayList;

import junit.framework.TestCase;

public class ProcessorTest extends TestCase {

    public class MockProvider implements IDataProvider {

        private ArrayList<String> reads = new ArrayList<String>();
        private int readsIndex;

        public MockProvider(ArrayList<String> reads) {
            this.reads = reads;
            readsIndex = 0;
        }

        public int read(byte[] buffer) {
            if (readsIndex < reads.size()) {
                String thisRead = reads.get(readsIndex++);
                byte[] theseBytes = thisRead.getBytes();
                System.arraycopy(theseBytes, 0, buffer, 0, theseBytes.length);
                return theseBytes.length;
            }

            return -1;
        }
    }

    public void testBufferedRead() {
        ArrayList<String> reads = new ArrayList<String>();
        String data = "<data><bar foo=1/><bar foo=2/></data>";

        reads.add(data.substring(0, 10));
        reads.add(data.substring(10, 20));
        reads.add(data.substring(20, 30));
        reads.add(data.substring(30, data.length()));

        MockProvider mockProvider = new MockProvider(reads);
        Processor processor = new Processor(mockProvider);
        processor.execute();
        assertEquals("1,2", processor.getDataRead());
    }
}


(Since the production IDataProvider is essentially an InputStream, the MockProvider here is set up to hand over the data in chunks via multiple calls to read(byte[]). Also, the real code is dealing with much larger chunks of data, and uses a 1KiB buffer.)

The trouble with this test is that it's not real enough. The test passes 100% of the time, in Java or .NET. "1,2" is always returned, but sometimes in production, when the problem would occur, we'd only get "2". So ... where's the problem?

After looking over this a couple of times, my eye was drawn to these lines in the production code:

int bytesRead = data.read(buffer);
if (bytesRead == -1) {
    break;
}
stringData.append(new String(buffer));


When the read method reaches the end of the data, the buffer isn't always going to be full - only if the total amount of data is a multiple of 10. Yet the last line in the above snippet appends the complete buffer every time. So, the last time through, there'll be some leftover bytes from the prior call added to the end of the data.

That would explain the intermittency, wouldn't it? Well ... no. For this flaw to fail only 3 times out of 100, the data would have to land on an exact multiple of the buffer size the rest of the time. The production data was not guaranteed to have any consistency in length, so even with the 10-byte buffer in our example code, we'd expect garbage on the end of the data about 90% of the time. In the production code, the buffer is 1 KiB in length, which means on average only 1 out of 1,024 times would the data align itself just so. And our test, which doesn't have aligned data, is passing 100% of the time.

Breaking out the debugger at this point proved futile, since the bug would never occur in the debugger. This could either be because it happened so rarely, or because we had a heisenbug on our hands**. It was time for some old school debugging.

I added some logging to the code:

...
private void processXML(String string) {
    // pretend this is real code
    System.out.println(string);
    ...


...then set up a Ruby script which would run the code in question over and over again, parsing the console output and breaking whenever the console output was not what it should be.
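The original script isn't shown here, but the core loop of a harness like that can be sketched as below. The method name is made up, and the canned outputs stand in for shelling out to the real program so that the sketch runs on its own.

```ruby
# Runs the block up to max_runs times and returns [run_number, output] for
# the first output that differs from expected, or nil if every run matched.
def first_bad_run(expected, max_runs)
  1.upto(max_runs) do |n|
    output = yield
    return [n, output] unless output == expected
  end
  nil
end

# Stand-in for the real program: canned outputs instead of a process run.
good = "<data><bar foo=1/><bar foo=2/></data>2/>"
bad  = "<data><bar fata><baroo=1/><bar foo=2/></data>2/></"
canned = [good, good, bad, good]

HARNESS_RESULT = first_bad_run(good, 100) { canned.shift }
p HARNESS_RESULT
```

In real use the block would run the program under test (with backticks, say) and return its console output. Stopping at the first mismatch keeps the evidence on screen instead of burying it under later runs.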

Here's the correct string when output:

<data><bar foo=1/><bar foo=2/></data>2/>


(There's the garbage at the end, which doesn't seem to cause problems for our parser.)

And here's an incorrect string when the bug was finally caught in the act:

<data><bar fata><baroo=1/><bar foo=2/></data>2/></


Ahhh ... interesting. Now we're getting somewhere. Naturally, with that string, we'd only be able to parse out the foo=2, since somehow the foo=1 was corrupt. And I realized I'd been a bit short-sighted about the flaw in the buffer append.

Some more debug output to confirm my new thought:

int bytesRead = data.read(buffer);
System.out.println(bytesRead);


New console output now when it works:

10
10
10
7
-1
<data><bar foo=1/><bar foo=2/></data>2/>


And when it doesn't:

10
2
10
10
5
-1
<data><bar fata><baroo=1/><bar foo=2/></data>2/></


Proof of my short-sightedness.

Earlier, I'd stated, "When the read method reaches the end of the data, the buffer isn't always going to be full." True, but incomplete. The read method is never guaranteed to return the full buffer of data.

Here's a failing test to flush out the problem:

public void testBufferedReadJagged() {
    ArrayList<String> reads = new ArrayList<String>();
    String data = "<data><bar foo=1/><bar foo=2/></data>";

    // Same jagged read sizes as the failing run: 10, 2, 10, 10, 5
    reads.add(data.substring(0, 10));
    reads.add(data.substring(10, 12));
    reads.add(data.substring(12, 22));
    reads.add(data.substring(22, 32));
    reads.add(data.substring(32, data.length()));

    MockProvider mockProvider = new MockProvider(reads);
    Processor processor = new Processor(mockProvider);
    processor.execute();
    assertEquals("1,2", processor.getDataRead());
}
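
For anyone who wants to poke at this outside our codebase, here's a minimal, self-contained sketch of the flaw and one way to fix it. None of these names come from the production code: JaggedReadDemo, JaggedInputStream, readIgnoringCount and readHonoringCount are all made up for illustration, and the "flawed" loop is only my assumption of the general shape of the original. The fix is simply to append only the bytes that read says it delivered.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class JaggedReadDemo {

    // Hypothetical helper: caps each bulk read at a scripted size to mimic
    // a network stream that hands back short reads mid-transfer.
    static class JaggedInputStream extends InputStream {
        private final InputStream source;
        private final int[] chunkSizes;
        private int call = 0;

        JaggedInputStream(InputStream source, int... chunkSizes) {
            this.source = source;
            this.chunkSizes = chunkSizes;
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Once the script runs out, fall back to whatever the caller asked for.
            int cap = call < chunkSizes.length ? chunkSizes[call] : len;
            call++;
            return source.read(b, off, Math.min(len, cap));
        }
    }

    // Assumed shape of the flawed loop: appends the whole buffer every time,
    // so stale bytes from an earlier read leak into the result.
    static String readIgnoringCount(InputStream data) {
        try {
            byte[] buffer = new byte[10];
            StringBuilder stringData = new StringBuilder();
            while (data.read(buffer) != -1) {
                stringData.append(new String(buffer, StandardCharsets.US_ASCII));
            }
            return stringData.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sketch of a fix: append only the bytes the read call reported.
    static String readHonoringCount(InputStream data) {
        try {
            byte[] buffer = new byte[10];
            StringBuilder stringData = new StringBuilder();
            int bytesRead;
            while ((bytesRead = data.read(buffer)) != -1) {
                stringData.append(new String(buffer, 0, bytesRead, StandardCharsets.US_ASCII));
            }
            return stringData.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wraps a string in a stream that returns the scripted read sizes.
    static InputStream jagged(String text, int... chunkSizes) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new JaggedInputStream(new ByteArrayInputStream(bytes), chunkSizes);
    }

    public static void main(String[] args) {
        String xml = "<data><bar foo=1/><bar foo=2/></data>";
        // Same read sizes as the failing console output: 10, 2, 10, 10, 5
        System.out.println(readIgnoringCount(jagged(xml, 10, 2, 10, 10, 5)));
        System.out.println(readHonoringCount(jagged(xml, 10, 2, 10, 10, 5)));
    }
}
```

The sketch sticks to ASCII on purpose. With a multi-byte encoding, a short read could split a character across two chunks, so collecting the raw bytes first and decoding once at the end would be the safer route.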


The production code in question was reading data over HTTP. Apparently, the code in Java never performed in such a way as to loop fast enough over the read calls to get a less-than-full buffer in the middle of the run, but the IKVM'd code in .NET would occasionally.


** We had a heisenbug. The reason we could never catch this in the debugger was that stepping slowed down the loop calling read over and over, so there was always a full packet of data waiting to be read by the time we stepped through that line of code.
 
 
//ComputersAndTechnology/OptimizingVisualStudioForTDD Mon May 19 2008 01:00 AM GMT
At my gig we've got a large Visual Studio solution - 37 projects at the time of this writing. Many of these are small library assemblies, and roughly half are unit test assemblies. It performs rather poorly. Even though Visual Studio is good about checking to see if it needs to build a project (except in a Rebuild, of course), it still appears to require some timestamp checking on source files each time, which is enough disk activity across all the projects to add up. Plus any pre- and post-build steps are always run -- so a build time of a minute or so kills the pace that good TDD requires.

It'd be nice if it'd keep some dirty markers in memory or somesuch as I code, to just know ahead of time what should or shouldn't need to be built, and then skip the pre/post steps as well. Don't get me started on how it should really go the next step and do background compilation.

I've been tinkering with this recently, looking for a way to improve things, and I think I've come up with a way to have my cake (one fat .sln file with easy access to all the source code for refactorings and analysis) and eat it too (run my unit tests fast and get me in a good TDD groove).


Working with ReSharper, I'd noticed running unit tests for assemblies low in the dependency tree wouldn't take very long. If there's a way to eliminate building the unnecessary dependencies for all tests - well - that'd be yummy.

The first option I pursued was setting up separate solutions for each pair of assemblies - the production assembly and its unit test assembly - and changing all of the project references from the production assembly to file references. This works for increasing the performance, but there are some cons with this approach which cost too much for the performance increase:
- Lots of new .sln files added to source control.

- There's no source code available for the dependent assemblies (with .pdbs, the source can be stepped into during debugging, but it can't be browsed easily, and I don't think R# can work with any of it).

- In order to work with source across multiple assemblies, multiple instances of Visual Studio must be used.

- ReSharper's quick doc view isn't available for code in the file referenced assemblies.

My cohort, Tony Mocella, suggested staying in one big .sln, but adding new build configurations, a separate one for each assembly pair, where only the two assemblies are checked for building. I tried it. It performed well and it has fewer cons than the separate .sln file approach...:
- No crufty increase to files in source control.

- All the source code for all assemblies is available.
... but it still had some cons I didn't like. First, it adds a bunch of configuration cruft to each project file, separate entries for each pair's configuration. This makes the guts of the project file seem very un-DRY - but only when looking at it in a text editor, which ain't a common task. The DRY violations can come back to bite if something like optimization needs to be toggled in each and every one of those configs.

In addition, I found switching between configurations to be slow - 45 seconds to a minute to switch. Less time than firing up a second (or third) instance of Visual Studio like the previous option requires, but still enough to break a rhythm if working in more than one assembly is required.

Tony's idea is pretty good, but once I'd already tackled trying these two options, it turns out I'd already taken care of the core problem: project references. By replacing all project references with file references (except for a project reference from the unit test assembly to its assembly under test), now building each production assembly in isolation took only the time necessary to build that one assembly. Sweet. Without implementing either of our original ideas - additional .sln files or additional build configurations - I got the performance improvement I needed, without any downside.

Except for one thing.

Try doing a clean build of the whole solution. Whoops. Without the project references, Visual Studio doesn't have an accurate way to determine the build order.

Staring at my monitor, cake in hand, unable to take a bite, my mind got desperate.

Then an idea. What if I open the .sln file in a text editor and re-arrange the projects in the order they need to be built? Surely, if Visual Studio has no project references to determine order, it won't disturb the order it read them in from the file?

Sigh. Not so much.

But I found a bit of a silver lining. While Visual Studio won't obey the order of projects in the .sln file, the following macro will:

' Build each project in the order it appears in the .sln file.
Dim aProject As Project

For Each aProject In DTE.Solution.Projects
    DTE.Solution.SolutionBuild.BuildProject("Debug", aProject.UniqueName, True)
Next


Put a keyboard shortcut on this macro and I'm good to go. So, all said and done, my cake doesn't have any frosting on it, but it'll do.



There are some details I've glossed over so far. In a large solution, changing all the project references to file references is a pain. But I found opening the project files in a text editor with good macro support was much faster than doing it all through Visual Studio.

I changed the output folders for all projects to be the same folder in both the Debug and Release configurations, and pointed the file references to each project's output folder. I suppose a single common folder could be used across all projects, but I didn't play with it. A single shared output folder could be more convenient. For example, if I'm doing TDD in a dependent assembly, then want to run the entire application, in my current setup, I have to build assemblies further up the dependency chain for the copy local actions to take place. Seems a single shared output folder wouldn't have that problem.

After first discovering the build order in the solution was toast, I experimented with establishing manual dependencies in the Project Dependencies dialog, but these dependencies are the source of the performance problem. Might as well go back to project references.

The macro as listed above has a problem - if a particular build fails, the macro continues on building the rest of the projects. This is just like Visual Studio, except unlike a regular VS build which accumulates error reports for all project failures, each project build in the macro is a separate action, and the error output is reset in between each build. If a project halfway through the solution fails, many of the rest may fail too, but the original error output will be lost. To fix this, there's a macro post build event you can hook into, to set a variable that can be checked in the middle of the original loop. See the "Canceling a failed build" section in this article at the Visual Studio Hacks website for details.

This is still a fairly kludgey approach - if you've any ideas for improving upon it, I'd love to hear about them.
 
 
//ComputersAndTechnology/ApacheLDAPAndActiveDirectory Sat Apr 26 2008 05:13 AM GMT


Recently at my gig, we made the switch to Subversion (just in time to find all the cool kids had vacated the premises for the latest hip club). I got wrapped up in setting the thing up, and when it came to authentication, our IT dudes reasonably requested we use the company's Active Directory setup.

It didn't sound too difficult, but turns out getting Apache's LDAP mod to talk to Active Directory can require a bit of arbitration.

Apache's doc for mod_authnz_ldap (2.2) covers the basic directives:
AuthLDAPBindDN
AuthLDAPBindPassword
AuthLDAPUrl
The first two are needed if you can't anonymously bind to the LDAP database (which you can't by default with Active Directory). The URL has a special format, detailed in RFC 2255. One destined for Active Directory probably looks like this:

ldap://adserver.fooco.com:389/DC=fooco,DC=com?sAMAccountName?sub?(objectClass=*)

This essentially means, "search the LDAP (ldap://) server on adserver.fooco.com, port 389, starting at the root (DC=fooco,DC=com), comparing the search term against the value in the sAMAccountName attribute, recursively (sub), querying all objectClass types (objectClass=*)."
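
If the question-mark-separated layout is hard to eyeball, here's a small sketch that pulls a URL like the one above apart into its pieces. It's illustration only: LdapUrlParts is a name I made up, and it leans on the general-purpose java.net.URI parser rather than anything LDAP-aware, assuming the ldap://host:port/basedn?attribute?scope?filter layout described above.

```java
import java.net.URI;

public class LdapUrlParts {
    public final String host;
    public final int port;
    public final String baseDn;
    public final String attribute;
    public final String scope;
    public final String filter;

    public LdapUrlParts(String url) {
        URI uri = URI.create(url);
        host = uri.getHost();
        port = uri.getPort();

        // The path is "/" followed by the base DN where the search starts.
        String path = uri.getPath();
        baseDn = path.length() > 1 ? path.substring(1) : "";

        // Everything after the first '?' holds attribute?scope?filter.
        String query = uri.getQuery() == null ? "" : uri.getQuery();
        String[] fields = query.split("\\?", -1);
        attribute = fields.length > 0 ? fields[0] : "";
        scope = fields.length > 1 ? fields[1] : "";
        filter = fields.length > 2 ? fields[2] : "";
    }

    public static void main(String[] args) {
        LdapUrlParts parts = new LdapUrlParts(
            "ldap://adserver.fooco.com:389/DC=fooco,DC=com?sAMAccountName?sub?(objectClass=*)");
        System.out.println("host:      " + parts.host);
        System.out.println("port:      " + parts.port);
        System.out.println("base DN:   " + parts.baseDn);
        System.out.println("attribute: " + parts.attribute);
        System.out.println("scope:     " + parts.scope);
        System.out.println("filter:    " + parts.filter);
    }
}
```

Apache does its own parsing of the AuthLDAPUrl value, of course; this is just a way to check that each field holds what you think it holds before handing the URL over.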

(The SAM Account Name is essentially the user's domain login name. More details here. It will be unique within a domain. If your Active Directory has more than one domain in it, and two domains contain the same SAM Account Name -- well, I guess you'll need to get a bit inventive with your LDAP searches. The search will return more than one result and Apache's LDAP mod won't be able to authenticate the login.)

And we've now hit our first hiccup: referrals.

This long thread helped get me on the right track. This Technet doc (search for the LDAP Referrals section) explains:
When a requested object exists in the directory but is not present on the contacted domain controller, resolution of the object name depends on information that is stored on that domain controller about how the directory is partitioned. In a partitioned directory, by definition, the entire directory is not always available on any one domain controller.

... When an operation in Active Directory requires action on objects that might exist in the forest but that are not located in the particular domain that is stored on a domain controller, that domain controller must send the client a message that describes where to continue this action — that is, the client is "referred" to a domain controller that is presumed to hold the requested object.

... Active Directory returns referrals in accordance with RFC 2251.
(Other text I didn't quote in that article indicates that Active Directory will only respond with referrals when necessary. In my recent (and limited) experience, it seemed to always return referrals with a root search, though it seems logical that referrals would always be necessary with a root search.)

Apparently, Apache's mod won't follow the referrals. A bug report has been filed on the problem, with a fix already committed that'll be there in the next release. But there's good news in the meantime.

First off, if you don't need a directory-wide search, you can restrict your search URL to a specific OU or other path, and referrals won't be needed. For example, if you only need to authenticate against users in an OU named bar, change the URL to:

ldap://adserver.fooco.com:389/OU=bar,DC=fooco,DC=com?sAMAccountName?sub?(objectClass=*)


If you do need a directory-wide search, you can use Active Directory's Global Catalog. Global catalogs are typically hosted on port 3268. From the prior Technet article, search for "Global Catalog Searches":
The global catalog enables searches for Active Directory objects in any domain in the forest, without the need for subordinate referrals. Users can find an object of interest quickly without having to know which domain holds the object.
NOTE: I tried going to the global catalog early on in my configuration attempts and still had problems. I didn't find out why, but I did come up with a workaround - switching to another AD server. Use ipconfig /all to show your DNS servers, and ping the IP addresses - those will probably be AD servers on your network.

[I intend to cover some more material in this article, but I've got to sign off for tonight. Coming topics: using mod_authn_alias to search multiple directories, or multiple query locations in the same directory, mixing LDAP authentication with other types, and restricting Directory entries to certain ldap groups, etc.]
 
 