Interview: Bloomfield Author Limoncelli Discusses Book on Cloud System Administration

Photo: Author Tom Limoncelli Photo Credit: Courtesy Tom Limoncelli
We recently interviewed Tom Limoncelli, co-author of the new book “The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems.” Limoncelli blogs at Everything SysAdmin. He can be found on Twitter @Yesthattom.

What is your IT background?

Tom Limoncelli (TL): In a word, variety. I’ve had technical and management roles. Most of my work has been with Linux/UNIX, with occasional stints in networking and security. Earlier in my career I did mostly enterprise and desktop fleet management, but since 2001 my focus has been on the server side of things. I’ve worked at small and big companies, the largest being seven years at Bell Labs and seven years at Google. Now I work at Stack Exchange [New York], the home of Stack Overflow and other Q-and-A websites, where we use a mixture of CentOS Linux and Microsoft Windows to deliver a very high-performing set of websites.

Where in New Jersey do you live? Did you go to college here, and if so, where?

TL: I’m totally a Jersey Boy. My family moved here from Connecticut when I was 4 years old. I grew up in Morristown, went to college at Drew University in Madison and have lived in central and northern New Jersey ever since. I now live in Bloomfield. I resisted the lure of riding the train into New York City every day, but when I got a job offer from Google in 2005, I couldn’t say no. Now I work at Stack Exchange … but my entire social/family life is in New Jersey.

You previously published the book “The Practice of System and Network Administration.” Why did you write a new book?

TL: The world has changed in 14 years! That book was from a time when system administrators had to provide help desk support, run servers and pull cables through walls to run network connections. Things are much more specialized now. Most system administrators run services, not help desks. While the old book is still useful for legacy enterprise organizations, “The Practice of Cloud System Administration” is all new material and focuses on distributed computing [“cloud computing”] and service administration. This new book brings a DevOps/SRE [site reliability engineering] sensibility to the practice of system administration.

What new material do you cover?

TL: This is a book about the best practices in IT that have been invented in the past 10 to 15 years at companies like Google, Facebook and others. It shows how to apply these practices to big enterprises as well as tiny startups. The book has two parts: design and operations. Part one is about the design of large systems: general design, how to pick a platform and three big chapters on design elements used to make systems easier to manage, more reliable and faster.

Part two describes how to run such systems. We discuss how to organize a team and its work, the basics of DevOps and SRE philosophies and how to manage a software-delivery pipeline and perform upgrades on live systems. We cover best practices for managing on call, disaster preparedness, monitoring, capacity planning and other important topics.

We’re very excited to present a new feature of our book series: our operational-assessment system. The system is a series of assessments you can use to evaluate your operations and find areas for improvement. This will help people scientifically and continuously improve the quality of the service and its management.

Who should read the book and why?

TL: This book is for three groups of people. Junior system administrators who want to up their game will find it a useful way to understand the new paradigms that are becoming standard practice everywhere. Managers of IT teams who want to modernize their practices will find it helpful for adopting DevOps and SRE techniques that make their services more reliable and reduce their team’s stress. It is also useful for developers. Lately more and more developers are being handed operational duties; this … book explains things at their level.

Who are your co-authors and what do they bring to the table?

TL: Strata Chalup has 25-plus years’ experience in Silicon Valley focusing on IT strategy, best practices and scalable infrastructures at firms including Apple, Sun, Cisco, McAfee and Palm. She brings a wealth of experience and anecdotes from the West Coast. This is the second book we’ve worked on together.

Christina Hogan has 20-plus years’ experience in system administration and network engineering from Silicon Valley, … Italy and Switzerland. This is the third book project we’ve done together. She brings an international perspective and a high bar for rigorous research, and we know she has good taste because her husband is from New Jersey. True story: I met him twice when I was in high school and he was at Rutgers.

What else should our readers know about the book?

TL: We put a lot of effort [into making] sure the book is fun to read and not another boring technology book. IT can be difficult, stressful work. It doesn’t have to be. We focus on the practices that make operations smooth, reduce stress and increase job satisfaction.
