Friday, August 12, 2011

Tenate Quaestionem: Deconstructing a Simple Coding Interview Question

I have had the pleasure of interviewing at a few companies recently and one thing about my coding-problem-solving style struck me as worth mentioning and explaining. Having conducted quite a few interviews myself (300 or so) I appreciate when candidates make a concerted attempt to understand the problem posed. Perhaps this is because it's something I do myself (we hire people like ourselves, unfortunately, and quite unconsciously) but I also think it's a good practice.

Here's a thorough example.

An engineer asked me to "write a method which takes two Date parameters and returns true if the first is one month after the second, otherwise false". (Lightly paraphrased.) He said I had the freedom to assume that Date objects have convenience methods I might need. Obviously, I immediately wrote


bool IsSecondDateOneMonthLater(Date a, Date b)

{
return a.IsOneMonthEarlierThan(b);
}


And just as obviously he said that was stretching it too far. So I concentrated on learning the true nature of the problem, because it was greatly underspecified. I needed to know many things about what the method was really supposed to do and how it should behave for its customers. I started asking questions.

Are the Date objects in the same calendar?

Motivation: a lot of rules and heuristics would be needed to transform one Date into a compatible calendar so I could do the calculation; assuming the transformation is possible; also assuming that it's really a valid thing to do.

Answer: yes.

Are they in the Gregorian calendar?

Motivation: I know this calendar best. Other calendars may have eccentricies I don't understand well, or which are difficult to calculate. More further down.

Answer: yes.

Are the Date objects guaranteed to be valid? For instance, could one Date be set to a strange day such as February 55th, or be set before the beginning of the calendar, such as Gregorian year 1210?

Motivation: determine how much error checking I must do. If dates will not let themselves be initialized or set to invalid states, then I can count on them to be in sane states.

Answer: yes.

Are the dates in the far future or far past?

Motivation: I'd like to know if I need to worry about special circumstances such as the distance between two years being too large to fit in some number types, or being so close to a boundary that overflow or underflow become a problem. For example, if there are more than 2^64 years between the dates, then I need a special number type to contain and manipulate them. Is it likely I'll be asked to work with dates more than 1.8 * 10^19 years apart? I don't know what this method is used for, so I don't know. It could be for cosmological modeling.

Answer: the dates are in the near future, don't worry about it.

What is a month?

Motivation: It could be a solar, lunar, or calendar month. The time of day could be important, or not. The location could be important, or not. Examples: 8:15AM January 2 and 8:16AM February 2 might, or might not be a month apart, that's a policy decision. I did not ask about location (translating to time zone) but I should have. Due to the dateline, the same GMT, in local time, could be on different days. If location is not a factor, or the Dates are normalized to GMT, I don't have to worry about that. I should have asked that, but didn't. My error.

A month could reasonably be a lunar month. That's a lot more complex, and if so I need to know whether the Date objects have enough information to make this determination.

Answer: the same day number in the next month is the only deciding factor.

Are leap years a factor?

Motivation: This helps clarify his intention. February 2 to March 2 is a month regardless of the varying number of days, right?

Answer: no.

At this point I am a little unsure of this definition, because it means that there's no second date that could be one month from October 31st, ever, there not being a November 31st in any year in the Gregorian calendar. But I run with it, so I write some examples, then write something quite like this code:


bool IsSecondDateOneMonthLater(Date a, Date b)

{

// check that a and b are both not null

// for randomly selected close dates, this is most likely to fail first, so try it first

// to save time.

if (a.Day != b.Day)
return false;

if (a.Month == 12) // December, special case
if (b.Month != 1)
return false;

if (a.month != 12)
return (a.Year == b.Year)
else
return (a.Year == b.Year -1);
}

Now, if it hadn't been an interview, I might have made this code a bit neater, such as:

{
return ( (a.Day == b.Day)

&&( b.Month == 1 ? (a.Month == 1) : b.Month == a.Month)

&& (b.Month == 1 ? (a.Year == b.Year -1 : a.Year = b.Year) );
}


or this:

{
if (a.Day != b.Day)
return false;
if (b.Month== 1)
return (a.Month == 12 && a.Year == b.Year -1);
else
return (a.Month == b.Month && a.Year ==b.Year);
}

They both have their own charm. But it was an interview, and I wrote for simplicity and ease. Writing code on a whiteboard is unnatural, so if I can save brain cycles, I do. I ran the code through my test cases, it worked, and I declared it done.

These questions are summarized - I rephrased a few to make sure I and the interviewer both understood, repeated the question back to him, and asked a few miscellaneous questions. I am sure it added up to 15 or 20 questions. I doubt it was always obvious why I was asking, and sometimes I explained. There's time pressure in interviews.

Is this what the interviewer wanted, or did he want me to just make a bunch of assumptions and bang out some code? I asked whether he felt I was over-analzying the problem or should hurry through it, but he said no. I don't know whether he was being polite or this was his actual expectation. Regardless, I did it my way, and I solved the problem. The end solution was trivial. Getting there was not.

That's a lot of my job. Taking complex problems, usually vague and under-specified, and turning them into clear problems that can be solved, then solving them (or showing someone else how to). Usually fairly simply, but not always. The not-always has led to a number of patents, but that's another story.

No comments: