My Machine Learning Research Jobhunt
In the last few months, I interviewed at a number of companies in Europe for an AI research position. I wasn’t able to find much information online, so maybe sharing my experiences will be useful to others.
For context: I just finished my PhD, and have publications at all the big Deep Learning conferences (NIPS, ICML, ICLR), and in journals that specialize in ML application fields. I come from a no-name university, my h-index is somewhere in the 5-15 range and my citation count is somewhere in the 500-1500 range. I’ve done an internship at a FAANG AI research lab before. This is the first time I’ve been job hunting, so I might have missed a few obvious things, or might state things that are obvious to people who’ve done this before (sorry).
In my experience, almost all advice about interviewing for Software Engineering jobs transfers very well to ML research positions, as those jobs are at the same companies. I cannot recommend the following blogs enough: they’re a gold mine, and I recommend reading them once before you even start applying to companies, once more before the actual salary negotiations start, and a last time while you’re in the middle of your negotiations:
I only interviewed at industrial research labs in global corporations. No start-ups, no smaller companies. Also, I restricted my search to Europe; pay scales and benefits are different elsewhere. There are a lot of great companies that hire AI scientists in the region: Amazon has an ML research lab in Berlin, Apple has ML jobs in Paris, Google Brain has labs in Amsterdam, Berlin, Paris and Zurich, Facebook AI Research and DeepMind are both present in London and Paris, Microsoft Research and IBM Research both sit in Cambridge and Zurich, NVIDIA has labs in Berlin, Helsinki and Munich, Twitter has an office in London, and Uber is hiring Research Scientists in Paris. Aside from this, there are a lot of interesting positions in more applied fields, such as automotive (VW/AUDI have a research lab in Munich), drug design (Benevolent AI sits in London, and Merck, Bayer, J&J and AstraZeneca are all looking for ML people) or finance (Citadel and Jane Street both have offices in London). Wandering the job booths at conferences, you’ll also encounter other very interesting research positions at cool companies, e.g. Bloomberg (London), Bosch (Stuttgart), Criteo (Paris, Grenoble), or Disney Research (Zurich). And there are likely a lot more out there that I missed.
While I did have some favorites in mind, I applied to as many companies as possible: without competing offers, you will be at a severe disadvantage in your salary negotiations. And the interview practice helps a ton. The difference in adrenaline between my first interviews vs my last ones was similar to giving a talk in front of 3k people vs talking to your colleagues during lunch. Additionally, you will definitely bomb some of your interviews – either you or the interviewer will have a bad day, or you get questions about one of your blind spots, or the position turns out not to align with your interests, or something else goes terribly wrong. For example, for reasons unknown to me, NVIDIA ghosted me in the middle of their interview process: their interviewer just didn’t show up to a pre-arranged video conference call, and they’ve been ignoring all my emails since. I still have no clue why. But I was glad I didn’t put all my eggs in one basket.
However, I discovered that the main advantage of interviewing at more companies was something else: I was getting to know more companies. There are a lot of cool jobs out there that I didn’t even think of! Some of the most interesting positions (and best offers) were at companies that I didn’t initially consider top choices. It turns out that some of my “safety picks” were a really good fit for me. Even if that might not be the case for you, getting to talk to a lot of teams about their current projects and their visions for the future is very inspiring and illuminating.
I applied to about half of the companies I listed above, either for Research Scientist or Research Engineer roles, and most of those applications ended up in job offers. My whole job hunt took a long time (half a year from my very first application to me accepting an offer), and was very exhausting: those months were a blur of airports, hotels and interview rooms, followed by months of phone calls and salary negotiations with HR people. Don’t expect to get much work done during this period. As a colleague put it: “your mind is constantly preoccupied with when you’re hearing back from this or that recruiter, there’s no mental capacity left to think about ICML”. Yet it all paid off tremendously: I learned a lot, collected a lot of new perspectives, and was able to negotiate a much higher salary than if I’d gone with the first “dream job” offer.
The interview process was very similar across all companies. After receiving my CV, the companies would invite me to do a brief screening to see if I was a suitable candidate, usually in the form of one or two phone interviews of one hour each. Then, I usually got invited for on-site interviews: a full day of interview after interview with various people, at the company’s offices. These usually started with me giving a presentation about my PhD research, followed by individual interviews of ~1 hour each, either with people from the team I was interviewing for, or with researchers/engineers in similar roles from other teams. Typically, I’d meet different people for each interview, so by the end of the day I had often met most of the people in my potential future team. Almost all interviewers made time for me to ask questions about the position, team or company. I liked asking questions about work-life balance, difficult processes or about things they didn’t like at their current job. I found that most people would answer these questions honestly and bluntly, which led to a number of poignant, preposterous, pathetic, and priceless insights into what my future might look like: from funny stories about the office dog, or the team lead who assured me that “we do work hard, but you’ll probably be able to cut back to less than 60h / week in your 2nd or 3rd year here”, or the guy who told me he was under so much stress that he was thinking about quitting (while still assuring me that “the office is really great, you’ll love it here!”), to the guy who was so excited about his own research that he forgot to ask me any technical questions, and instead just stream-of-consciousnessed his current breakthrough to me… These were great opportunities to learn more about the company and the position I was interviewing for. Definitely do take the time to ask some good questions!
Types of interviews
There are a couple of different types of interviews that I’ve encountered over and over again. Some of these are easy to prepare for (e.g. coding or behavioral interviews), while it’s almost impossible to prepare for others. All companies had a different mix of interviews: several companies made me an offer without ever verifying that I could code my way out of a paper bag (i.e., no coding questions), while others never verified that I knew what an expected value was (i.e., no math questions). Some gave me more theoretical problems, others more practical ones; at most companies, it was a mix of both. In general, I found that people were always willing to give me hints if I was stuck. I think they often under-specified some of their problems on purpose, just to see how I would react, and were eager to help or discuss details with me. It never felt like an adversarial process; it was more like a discussion between colleagues.
“Tell us about your research”
A lot of interviews simply consisted of me talking about my past work. The interviewer would pick a paper from my CV and ask me to talk about it, or sometimes they would let me pick which project I would like to talk about. Some interviewers would only ask shallow questions, while others went fairly deep into it (“You assume heteroscedasticity in theorem 3, but never justify it throughout the paper. Why did you think this is a valid assumption, and what are its implications?”). But we never went “math-level deep”: there are one or two tricky proofs in papers I’ve co-authored that I dreaded being brought up; but luckily all interviewers were as afraid of talking about such things as I was. I was mostly asked about papers that I first-authored, but people didn’t mind me discussing papers that I merely co-authored, either.
Coding interviews are the typical software engineering interviews that Google or Facebook are so famous for: you need to come up with and implement the solution to some puzzle of algorithmic nature, usually in a language of your own choosing, e.g. C++ or Python. There were several iterations of each problem: you first come up with a simple solution and implement it, and then the interviewer gives you additional restrictions or asks for a more efficient solution. Afterwards, people would usually expect me to discuss time or memory complexity, or to discuss potential test cases for my implementation. A lot of the time, the discussion would then go towards even harder versions of the same problem. In several instances, interviewers admitted afterwards that they themselves didn’t know how to solve the last iteration of the problem that they’d given me; they just wanted to see if I could come up with something, or how I’d react to an unsolvable problem. I found that these interviews were the easiest to prepare for: going through something like Princeton’s Algorithms, Part 1 and Part 2 and doing some problems on leetcode.com should be enough to prepare for this.
Machine Learning interviews
Some interviews simply tested general ML knowledge. They covered things that a normal ML class at university would cover, too. These types of interviews were typically split into two parts. The first part consisted of general-knowledge ML questions (How do you regularize a deep network? Where does boosting come up in Random Forest training? What would be two suitable classification algorithms when prediction speed is more important than accuracy? How would you go about grouping documents into semantic groups by their content? Can you discuss the connection between Gaussian Mixture Models and k-means?). There was often a second part that consisted of “ML coding”, where I had to implement some standard ML algorithm. For example, I remember implementing inference/pruning of a decision tree, k-means and k-NN. I was typically given ~30-45 minutes to implement these (and again, talk about efficiency and maybe possible test cases).
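To give a feel for the kind of “ML coding” exercise described above, here is a minimal sketch of k-means (Lloyd’s algorithm) in plain Python. It is not a solution from any actual interview: the function names (kmeans, squared_distance), the parameters (n_iters, seed) and the choice to initialize centroids by sampling k data points at random are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iters=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid that points[i] was assigned to.
    """
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None

    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop as soon as no assignment changes.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]

    return centroids, assignments


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

Each iteration costs O(n · k · d) for n points of dimension d, which is the sort of efficiency remark that fits the follow-up discussion. The hard assignment step is also a natural starting point for the GMM question: k-means can be viewed as a limiting case of EM for a Gaussian mixture with a shared isotropic covariance whose variance goes to zero.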
“So we have this problem…”
In some interviews, I was just told about a project the interviewer was currently working on, and then they would pick my brain on how to solve it (“we’re trying to find duplicate videos in our video database”, “we need to rank these millions of entities according to some vaguely specified criterion, and do so with sub-second latency”, “we have very little labelled data and want to use GANs to augment our data set, what’s the best way?”, …). There’s no really good way to prepare for those, but I got the feeling that people mainly wanted to get a feel for my thinking process. So these interviews were less about coming up with the perfect algorithm, and more about brainstorming or discussing trade-offs, even if your initial idea was completely off.
Whenever I was told that I’d be interviewed by some person in HR, what followed was almost always a Behavioral Interview. Luckily, the questions are almost always the same, so you can prepare for them well ahead of time. One company even sent me a booklet about their “company values” and told me I’d be interviewed about how I reflect those values in my daily life. Just google “Behavioral interviews” and you’ll find lots of resources. All in all, I didn’t see these questions too often, maybe 3 or 4 times during my whole job hunt.
Some companies had me go through interviews that I didn’t encounter at any other company, like math puzzles (this text is a good start), or a “paper discussion” interview where I was given a paper in advance to read through (typically from a research area I wasn’t very familiar with). Or a multiple-hour written exam on the fundamentals of statistics, probability and optimization theory :)
After my interviews, a recruiter at the company would be in touch, and give me “the good news”. I was always straightforward about interviewing at several places, and every recruiter was very accommodating and understood that I would only be ready to discuss further steps once I had heard back from all companies. Then the salary negotiations started. A lot of ink has been spilled on that topic already. Definitely check the linked blog posts at the beginning of the text for some more general pointers.
Naturally, salary varies extremely with region. Salaries in ML track software engineering salaries pretty well, so levels.fyi or GlassDoor give a good idea of what kind of salaries to expect. Blind also has a lot of good information both about salaries and the general interview process, if you can handle the amount of toxicity there. However, the numbers you’ll find on those sites are heavily biased towards Silicon Valley, and America in general. I was able to find some European salaries on those sites, but I had to look for them. Even within Europe, there are large differences between countries: notably, the UK and Switzerland pay much higher salaries than other countries. When companies asked me for my opinion on salary, I’d always tell them that 100k EUR/year was a good number to start negotiations with. Even before I started interviewing, I knew that this was a realistic lower bar, considering what I knew from my research on the web, my prior internship and talks with friends and colleagues. There were definitely companies in the UK or Switzerland who would be willing to pay that much, though the number is fairly ambitious for other regions in Europe. Still, I figured that starting high was better than awkwardly skirting around the “I don’t want to give you a number” issue.
Most of the initial offers were around ~80k - 120k EUR / year. These were total yearly compensation before taxes, so these offers include base salary, expected bonus amount as well as any stock options or extra pension contributions. I floated the highest of these offers to all recruiters. Overall, I felt that US-based companies (i.e., the ones whose center of operation is in the US) were pretty much unfazed by the number I gave and were willing to negotiate, while most of the European-based companies told me that they wouldn’t be able to compete with that offer. By the way: no European company ended up offering more than 100k EUR throughout the whole process. As a next step, I decided on which offers I thought were really worth following up on. Some companies made a bad impression during the interview phase, and others were merely a “last resort” option, which I wouldn’t need anymore. I thanked them for their time and told them I wasn’t interested any more. I was left with a very small group of offers that I was very excited about: I would’ve been willing to take any one of these offers right on the spot! This left me in a very strong negotiating position: I could ask each company to make me a better offer than the current highest bidder, without being too careful about scaring anyone off. Even if someone retracted their offer (or was unwilling to meet my demands), I still had other great offers left. Things ran themselves: everyone outbid the others, driving my total yearly compensation into areas I’d never have dreamed of. It was a very surreal feeling.
I did maybe 2 rounds of back-and-forth between all the companies. I had the impression that the recruiters were getting desperate at that point: inviting me & my SO to all-expenses-paid weekend trips to their city, sending me surprise gift baskets in the mail, handing me exploding offers, etc. I felt it was time to call it quits before they all lost their patience. It’s worth pointing out that while the companies did offer better relocation packages and sign-on bonuses, they were inflexible in offering additional non-monetary benefits. For example, no company was able to add more vacation days to their original offers (which were mostly dictated by regional standards, and varied from 25 to 45 paid workdays off per year). In the end, by leveraging competing offers this way, the final offer of the company I eventually signed with was double what their initial offer was. My yearly total compensation ended up in the 160-240k EUR range (not including sign-on or relocation bonuses).
A final word about recruiters
I’ve never seen a group of people relying on carrot and stick as relentlessly as recruiters. They’ll tell you “this offer is the best we can do” (only to update it once they’re outbid), emphasize that “they will make an exception for you because you’re such a great candidate” (while offering things that you already knew they’d offer to everyone), they’ll pull strict deadlines out of thin air (only to be completely open to postponing them 5 minutes later), they’ll tell you that they won’t renegotiate (only to renegotiate as soon as you bring them a higher competing offer), they’ll tell you for 2 months that they’re interviewing other candidates for the same position (but never pull the plug on the negotiation). It felt like the recruiters were always trying to determine if I was truly serious about picking another offer over theirs, or whether I was just using them to renegotiate with another company. I guess all of this is to be expected, and my best advice is to always be polite, patient, and persistent. In my limited experience, as long as you have multiple offers (and are willing to drop any one of them unless you get what you ask for), you hold all the power. So negotiate hard.