January 2011, Mt.View campus
Visited Google first 5 years ago, that time was rejected. This time contacted by recruiter again, offered to come for the interview again.
Sweeping Silicon Valley rumor mill brought several facts about interview practices – each interviewer has to submit rather detailed report to the hiring committee recording after you everything written on the white board and allegedly even recording what you say. Sometime 2 interviewers are talking to the candidate, in such cases second person is a trainee. Sometimes there are two rounds if Google is uncertain. Candidate has to select programming language of preference for the interview (my choice was C as I was applying to what recruiter called “kernel” positions in ChromeOS team).
Was first met by the recruiter, he spent 10 mins with me, I asked about the practices above, even though he confirmed, but downplayed "exact recording of what candidate says part". Answered that written report on a candidate is about 1-2 typed pages on average.
Total met 6 people, 5 technical, one lunch
First interviewer:
1. General discussion on testing methodologies (why unit tests might fail?)
2. Asked me to code review some C++ code even though I explicitly selected that my language of preference is C. Couldn't say much as I don't use C++.
3. Asked to implement C++ method for a class "arbitrary reader" given we have a class 4K reader which can read only 4K blocks.
Second interviewer:
1. Given two binary search trees, write function which tells if two such trees are the same – (i.e. same info in the nodes, same branching to left and right at every node).
Recursive implementation was presented. Was asked to do non-recursive version (question was not about better solution, say, w/constant memory) – I proposed to change recursive algorithm to non-recursive with a user-defined stack (formal transformation rather boring but doable, FORTRAN people do it). Demonstrated that but code was messy with gotos, etc. Asked about complexity – both recursive and non-recursive versions required linear stack in worst case of pathological one-sided trees.
Third interviewer:
1. Reverse bits in 32-bit integer.
2. Given two large files with 64-bit integers produce file with integers which are present in both files and estimate O(?) complexity of your algorithm.
Proposed solution of splitting both files in n chunks (k integers in each) and solving n^2 subproblems while incrementally updating the result by merging sorted files of partial results (we will need to sort each chunk once). After algorithm was finalized, we checked the complexity and it was something cubic by n and k. Too bad. Navigated me to the solution to sort both files and find intersection by merging, this was better. As an afterthought I still think that when files do not have large overlap my solution might be winning as sorting original large files is not required (k should be selected so that each chunk could be sorted in memory, how about n=k=sqrt(file-size)?:).
3. One has a stream of queries coming over a day, how do you keep a cache of 1000 most-frequently used ones? Did not have time for this one.
Fourth interviewer:
1. Asked light technical questions on past items listed in resume.
2. When you are doing "read()" call what it happening under the hood in user space? (answer needs to clarify how it is translated to a system call by libc).
3. How to find a median in a large file of 64-bit integers.
I mentioned first the classical "order statistics" algorithm with splitting initial group into 5-tuples and finding median of medians, etc. My proposal on the problem was toadopt median algorithm to the case when we are dealing with large file – however no additional disk space could be used, it should be done "in place" and it is not clear for me if order statistics algorithm could be done easily with such restriction. Interviewer downplayed the order statistics deal and navigated me to the solution that I liked – keep in memory counters for binary intervals practically counting occurences of prefixes (say, 24-bit prefixes). After that we are able to tell what prefix range the median belongs to, recalculate index of the element to search instead of median and go over the particular prefix range and find it there. Sweet.
After two week got rejection as not matching with current open positions.