HACKER Q&A
📣 linusb

Tips on Sysadmin Job


Recently I got a promotion at my job. I was working with another guy in a small company with an even smaller IT department. That guy was fired, and I am now the only sysadmin. I have never had any experience running a whole IT infrastructure, and the company has 4 campuses around the city. I'm kinda confused about how to manage everything. My background is in information security, and this is going to be a big challenge for me because it's my first job.


  👤 300 Accepted Answer ✓
Here's what to do in the first few weeks:

1. Request not 1 but 2 new hires, right now. It will take a long time for the request to be approved, for HR to do their thing, and for you (or someone else) to hire people to help you.

2. Document, as fast as you can, all of the assets in your environment. Don't waste time looking for software. Plain text or a spreadsheet will do for a start.

3. Identify the most important assets (servers, network devices, etc.) and try to understand what your priorities are for each. For example: availability for your frontend servers; confidentiality and integrity for the server storing PII. Based on this, you'll understand what your biggest risks are. Would it be a business disaster if your website were down for a few hours? Likely not. What if the PII is compromised and you get huge fines and end up in the news? More likely.

4. Find out who has access to the most important assets, and cut off anyone who doesn't really need it.

5. Establish some sort of monitoring for the most important assets.
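
Steps 2 and 3 above can start as a plain CSV file. Here is a minimal sketch in Python. The field names (`campus`, `priority`, `critical`) and the two example assets are assumptions made up for illustration, not a standard schema.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class Asset:
    # All field names here are illustrative assumptions, not a standard schema.
    name: str       # hostname or label
    kind: str       # "server", "switch", "printer", ...
    campus: str     # which site it lives at
    priority: str   # "availability", "confidentiality" or "integrity"
    critical: bool  # True if losing it is a business disaster


def write_inventory(assets, out):
    """Write the assets as CSV rows to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(Asset)])
    writer.writeheader()
    for asset in assets:
        writer.writerow(asdict(asset))


def critical_assets(assets):
    """Return the names of assets flagged critical (input for steps 4 and 5)."""
    return [a.name for a in assets if a.critical]


# Hypothetical example data.
ASSETS = [
    Asset("web01", "server", "campus-a", "availability", False),
    Asset("hr-db", "server", "campus-b", "confidentiality", True),
]

if __name__ == "__main__":
    buffer = io.StringIO()
    write_inventory(ASSETS, buffer)
    print(buffer.getvalue())
    print("Lock down and monitor first:", critical_assets(ASSETS))
```

Swap the `io.StringIO` buffer for `open("inventory.csv", "w", newline="")` to get a file you can open in any spreadsheet.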


👤 Intermernet
Okay, 3 rules for sysadmin positions:

- The most important rule is that your recovery from backup procedure is working properly.

- The second most important rule is that your backups are working properly.

- The (distant) third most important rule is to always individually back up any text file you edit before you edit it.

Seriously. These are the most important rules.
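
The third rule is easy to turn into a habit with a tiny helper. This is a sketch only: the `.<timestamp>.bak` naming scheme is an assumption, and it covers rule three, not real backups.

```python
import shutil
import time
from pathlib import Path


def backup_before_edit(path):
    """Copy `path` to a timestamped sibling file and return the copy's path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    # copy2 also preserves timestamps and permission bits where it can.
    shutil.copy2(source, backup)
    return backup
```

For example, call `backup_before_edit("/etc/ssh/sshd_config")` before opening the file in an editor. Two calls within the same second reuse the same name, so the later copy overwrites the earlier one.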

Other pro-tips: test any changes to remote access (SSH most importantly) before you close the session in which you made them. Also, schedule a restore from system backup and a restart before you make any potentially crippling changes to the system configuration. If the changes work, you can cancel the restore / restart. If the changes cripple the system, it will (if you've followed rules one and two above) recover itself. Finally, if you can afford it, try to run at least one dev / test instance of each critical system. You can make changes, point a few test clients at the new instance, make sure it's working, then either make the changes to the prod system or do the dev / prod swap-over.

The actual ways to achieve these rules vary depending on OS mix, hardware, cloud providers (if any), use of certain tech (are you using virtualization? Are you using containers? Are you using some form of orchestration tooling? etc.) and many other things.

The first two rules mentioned above, and the order in which they're mentioned, are by far the most important rules in sysadmin. Ignore them at your own peril ;-)

Good luck. Keep a level head, don't panic, and test your backups!


👤 loriverkutya
This is not a promotion. This is making you do something you have no experience with, and making you do the fired person's job on top of your own. Also, being responsible for the whole IT infrastructure alone is definitely not something I would do without any experience.

If that’s how the company handles this, I would start to look for a new job immediately.


👤 yk
Run!

Or at least go to your manager and make it very clear that it is simply not possible to provide anything like 24/7 incident response when you are alone. That is the kind of conversation where it is better to get fired than to not get your point across.

The basic structure of the admin job is that there is always more to do, and it is very easy to burn out. It is always important, but trying to chase some ideal of a perfect system will just exhaust you. Furthermore, checking your monitoring before going to bed or a few times on the weekend may not seem like a big problem, but it limits your rest periods to a few hours at best, you no longer get a full weekend, and that will grind you down over time. So manage your time and sanity, and importantly also manage the expectations of your boss and co-workers: there is only so much you can do when you're alone.


👤 me_me_me
Advice I heard from a lab manager with 15 years on the job.

1. When everything works everyone is happy and nobody pays you any attention.

2. When something is not working you will get shit from everyone for 'not doing your job'.

You will not be able to completely avoid 2. So your only chance to balance the scales is by showing your hard work when in case 1.

That means auto-generating graphs of network resources and disk space, and creating trend reports with actionable recommendations for your manager/boss.

This way they have visibility into the IT world instead of it being magic that just works. It also involves them in decisions, no matter how small.
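
A trend report can be as small as a straight-line extrapolation of disk usage. Here is a sketch, assuming you already record one usage reading per day in a consistent unit. The sample numbers are invented.

```python
def days_until_full(samples, capacity):
    """Estimate days until `capacity` is reached from daily usage samples.

    `samples` is a list of usage readings, one per day, oldest first.
    Returns None when usage is flat or shrinking (no fill date to report).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Average daily growth between the first and last reading.
    growth_per_day = (samples[-1] - samples[0]) / (len(samples) - 1)
    if growth_per_day <= 0:
        return None
    return (capacity - samples[-1]) / growth_per_day


if __name__ == "__main__":
    used_gb = [400, 410, 420, 430]  # hypothetical readings, in GB
    print(days_until_full(used_gb, capacity=500))  # prints 7.0
```

"File server full in about 7 days at the current rate, recommend buying storage" is the kind of actionable line a manager can act on.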

This is of course a job security measure, and you need to do your job first. Good luck.


👤 28304283409234
Initial thoughts of a 20-year veteran:

- Create a single source of truth you can automate against. Do that now. Use Ansible to fetch and create an overview of what your landscape is actually running: which software, which versions. Find out what is going EOL, as this will bite you soon enough.

- Don't automate anything else until you understand the full context of what you are automating. Automation is abstraction, and you don't know yet what to abstract.

- When you do automate, start with writing tests. https://testinfra.readthedocs.io/en/latest/ is a quick win to ensure all your systems have a certain configuration, can reach certain ports, and have closed other ports.

- Once you have tests in place you can start creating Ansible playbooks for changes, as changes _will_ happen.
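
testinfra has its own API for this kind of check. As a dependency-free illustration of the same idea (assert that a port is open, or that it is closed), here is a sketch using only Python's standard library. It is not testinfra code, and the hostnames and ports in `EXPECTATIONS` are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# (host, port, should_be_open): hypothetical expectations for illustration.
EXPECTATIONS = [
    ("intranet.example.internal", 443, True),
    ("intranet.example.internal", 23, False),  # telnet should stay closed
]


def failed_expectations(expectations):
    """Return the entries whose actual state differs from the expected one."""
    return [
        (host, port, expected)
        for host, port, expected in expectations
        if port_is_open(host, port) != expected
    ]
```

An empty list from `failed_expectations(EXPECTATIONS)` means reality matches what you wrote down. Anything else is either a problem or a gap in your documentation.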


👤 hardwaresofton
It seems no one has suggested this, but maybe you could try getting in touch with the person who got fired? Buy them the beverage of their choice and figure out what happened (and whether you’re next in X years). Let them know the pickle you’re in (and that you don’t blame them, of course), and ask for some tips if they are willing to give you some.

You worked together, surely there was some rapport there?


👤 shireboy
I’m a dev who got sucked into some ops/sysadmin work. Not a whole department, but I think my advice would still apply. The most effective thing I did was write a custom script to test everything I was responsible for and send me a nicely formatted email with the results. I’m not talking about a canned email from the helpdesk or a network monitor, but tests like “is the backup file's modified date the expected value?”, “is disk space on X at least N% free?”, “are Z services up?” All in a single email, every morning. I wrote mine in PowerShell, but you do you.

If something bad happens, add a test for it to that email. It’s not a be-all and end-all, but it helps you be proactive and keep a pulse on the things you are on the hook for. Real-time alerts etc. can be useful, but they are also a lot of noise.
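
A rough Python equivalent of that kind of morning email might look like the sketch below. The paths, thresholds and subject line are placeholders, and this is not the original PowerShell script, just the same idea under those assumptions.

```python
import os
import shutil
import time
from email.message import EmailMessage


def check_backup_fresh(path, max_age_hours=24):
    """Pass if the backup file exists and was modified recently enough."""
    try:
        age_hours = (time.time() - os.path.getmtime(path)) / 3600
    except OSError:
        return False, f"backup {path}: missing"
    return age_hours <= max_age_hours, f"backup {path}: {age_hours:.1f}h old"


def check_disk_free(path, min_free_pct=10):
    """Pass if the filesystem holding `path` has enough free space."""
    usage = shutil.disk_usage(path)
    free_pct = usage.free / usage.total * 100
    return free_pct >= min_free_pct, f"disk {path}: {free_pct:.0f}% free"


def build_report(results):
    """Turn (ok, text) pairs into one email, failures listed first."""
    failures = sum(1 for ok, _ in results if not ok)
    lines = [("OK   " if ok else "FAIL ") + text
             for ok, text in sorted(results, key=lambda r: r[0])]
    msg = EmailMessage()
    msg["Subject"] = f"Morning checks: {failures} failing"
    msg.set_content("\n".join(lines))
    return msg


if __name__ == "__main__":
    # Paths and thresholds are placeholders; swap in your own.
    report = build_report([
        check_backup_fresh("/backups/latest.tar.gz"),
        check_disk_free("/"),
    ])
    print(report)  # hand to smtplib.SMTP(...).send_message(report) to mail it
```

Each new incident becomes one more small `check_...` function returning an `(ok, text)` pair, so the email grows with your experience of what breaks.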

Also, a kanban board like Trello or Notion has been useful.


👤 linusb
OP here. Thank you all for the advice. I've been reading all of your comments, and my heart and mind feel lighter now. I would like to share some skills I do have:

Programming background in C and Python, plus Bash and PowerShell scripting.

Knowledge of networking and computers in general (I know how to fix hardware and how to deal with Linux and Windows).

My background is in InfoSec, so I know how to write scripts and how to analyse and mitigate vulnerabilities.

It's my first job, so I'm really anxious about how to do stuff: I have some theoretical knowledge but have never applied it in a company. I have only done it at home, working on personal labs.

Thank you again, I really appreciate your comments. Love you all


👤 Ciantic
Sysadmin of four campuses sounds like a job that requires more than one sysadmin. Being a sysadmin is pretty much like herding sheep: it's a 24/7 job.

Something breaks on Christmas Eve, you have to fix it. Something breaks on Friday evening, you have to fix it. This is a job that usually needs multiple people in the long run.


👤 mnw21cam
My advice would be prioritisation. In decreasing order:

1. Make sure intruders can't misuse your data/resources.

2. Make sure your data doesn't accidentally go missing or get corrupted.

3. Make sure the resources are sufficiently set up for other people to get their jobs done.

So, with a security background you have the most important of these already. The next most important is handled with backups. Then the rest is less critical.

Next, you need to decide whether the system as it currently stands is basically working but needs little adjustments, or needs redesigning from scratch. Take a few months of observations before you decide this.

Document the issues that you deal with. Make notes on the nature of the problem, the priority, the expected effort required to deal with it, what happened, how much effort it actually took. An issue tracker is a good way to do this, and I'd recommend using one even if it's a small company.


👤 dsr_
The components of this job may or may not include:

- desktop (human!) support

- networking inside buildings

- networking between buildings

- server maintenance

- OS administration

- authentication systems

- local services (DNS and NTP are the most important, but your end users will think of file sharing and printing as most important)

- external services

- security (information, yes, but possibly physical systems, too)

- backups (nobody cares about backups, they only care about restores)

Figure out which of these are currently running, then which are your responsibility. Document everything, and save copies as automatically as possible - you have four sites? You want five copies. One in each location and one in a secure off-site location.

Make lists. Review them in the morning and before you go home.

Get a copy of Limoncelli and Hogan's book, The Practice of System and Network Administration, and the follow-up, Time Management for System Administrators. Expense them.

Later, figure out costs.

Good luck.


👤 vnuk
As your first move I'd suggest you start pushing management for a new hire. Four locations and only one person managing them? That doesn't sound feasible in the long run and it's definitely not fair to you.

👤 j_not_j
My 32 years of experience suggest:

1. Don't do anything that you cannot undo. Like editing /etc/passwd in vi without copying it first. Or yelling at a user. Or upgrading software or an operating system without backups.

2. Have the trust and authority of your manager. This needs to be negotiated in advance. Warn management of your steep learning curve and warn them to expect outages. Ensure you can tell users to wait because you have a priority list established with your manager. Remind users, if necessary, that they are not your manager.

3. Make a note of every user request, even if just by counting them. For 100 users you might expect 50 to 200 requests per month. Be sure management sees this report. It is evidence to support additional hiring.

4. Learn customer service skills. Be able to stay completely calm in the face of the CEO screaming in your face. This requires a lot of "emotional IQ" and probably training and coaching. If you can achieve this level of zen-consciousness just 80% of the time you win.

All the other advice posted here also applies. The trick is to adapt the advice to the situation you face, and, when you make a mistake, quickly reverse course.

Good luck!


👤 yabones
Just want to say, good luck OP. I was in your position about 3 years ago, though with a smaller company at the time. It was the hardest, most stressful experience in my career so far but I've come out the other side with immeasurably greater confidence and skills. This will be the most brutal learning experience of your life. But, if you need to throw in the towel, there's no shame in that at all.

👤 bayindirh
A lot of good advice has already been thrown around. Let me add another tip:

Always do smoke testing. After setting up a system and getting it working, reboot it once. No preparation, nothing. Just reboot out of the blue.

If the system boots and works as normal, you're done. If not, fix the problems and retry.

Will save a lot of headaches down the line.

Also, document all your procedures. Everything. To something tidy & local. Something like TiddlyWiki or a tool of your choice.


👤 trabant00
Start top down: ask what has the most business value and take care of that first. You will not be able to do everything an experienced sysadmin would do, so you need to prioritize to the extreme. Only deal with important stuff.

The most technical aspect: don't lose or destroy the data! Downtime is fine, losing data is catastrophic.

Grow a thick skin or move on. You will be the target of complaints. A lot.


👤 roamerz
Understand and document what you are responsible for, including storage, hosts, and networking.

Document the dependencies of apps on the above.

Verify backups and that they are backing up the right information and that they are immutable until their lifecycle is complete.

Verify your critical infrastructure has two power sources. Most equipment has two power supplies, and I make sure they are independently powered (UPS and line/generator). Do not rely on just a UPS, because it will fail you.

Make sure your datacenter has redundant A/C. Cooling failure will kill you as certainly as power failure, just more slowly.

Know your facilities contacts especially the after hours emergency numbers and make sure they are current.

Know your IT support contacts. Sysadmins are somewhat jack of all trades master of none. Call the masters and use them when needed. No shame at all in that.

Don’t be a dick. Relationships are important.

Beyond that, go out of your way to help people even if it’s not in your wheelhouse.

Be aware of your long-term stress level and well-being. Your responsibility is to your people, and you can’t meet it effectively if you are not mentally able.

Communicate with your boss and make sure you are getting the support and tools you need to do your job. If you need it ask for it.

KISS - Don’t be sucked into complexity where it’s not warranted. Simpler is more reliable than complex if you don’t understand the complex.

Don’t automate unless you completely understand the process you are automating and all the dependencies.

Don’t offhand just trust vendors or their promises. Verify.

Pay extra for good hardware.

Read HN

In no particular order.


👤 hullsean
I’ve been doing the above as a consultant for a couple of decades. Here’s what I see:

o As others have said, it’s a lot. That said, it sounds like management trusts you, more than they would someone hired off the street to take the reins. This means there is room to breathe and room to fail a little bit.

o Talk to all biz units to learn what services they rely on. This will not be 1:1 with servers and applications as you see them, but it’s an important starting point.

o Inventory all the systems that you can see/find.

o Identify backups and DR, and put together a list of your concerns. Document this and share it with management. This will give you cover.

o Develop your own priority list. Be prepared for management to give you a different set of priorities. You will need to learn the skills of pushback and compromise.

o Learn to reach “good enough” in the short term. If you try to fix everything elegantly and perfectly, other problems will wait longer.


👤 lordnacho
So you're the only person in charge? Ask for stuff that people in charge have: more money, and staff.

I'm not saying this because I think you need more money and staff, you haven't mentioned either.

The reason you need to do this is to check what kind of organization you're working for. Do they value your presence? If they don't make a move on either of these issues, you know the answer and you can leave immediately. The job market is white hot right now, you don't need them. Either they're dumb and they don't realize they need you, or they're cheap and they think they can get you cheaply. They could also be reasonable and act like it.

Basically it's poker and you've got a decent hand.

Oh yeah, and don't let them do the "we're working on it" thing. Immediate raise, and some kind of job ad in a public place, now.


👤 codingdave
Is this a corporate job, or do you work for a school district or some other kind of entity?

I ask because much of the advice here is assuming the corporate world - and if that is true, it is all mostly valid advice. But what you are describing is fairly typical for IT in small government environments (schools, libraries, etc.)

So I'll give a little advice in case you are not in a corporate world -- let your boss know you are in a bit over your head and need some help with learning and training to fill in knowledge gaps. In non-corporate orgs, this is completely normal - the IT folks there work what they know, and ask for help when they don't know. They also tend to network with each other - maybe the guy who runs the library in the next county over has some different skills, and you can cross-train each other, etc.


👤 geocrasher
I'd be questioning why somebody without the proper qualifications was promoted to such a position. Have you heard of the Peter Principle? If not, go look it up. That's not a slight.

I once got "promoted" by way of the superior admin being let go. He'd set up the place so well that it maintained itself (to a point) and when he had a medical event that took him out a few weeks, things ran so well that they figured he wasn't needed anymore.

They stuck me in his position, which I was not ready for. Things got bad for me, and they tried to make it look like I was the bad guy when I quit a few months later.

Perhaps your circumstances are different, but you need to be 100% sure they're investing in you, not setting you up to be the fall guy.


👤 emreb
What a great opportunity to learn on the job! Here are my few tips:

1. Do not shy away from asking for help.

2. Google everything, and never be satisfied with the first answer you see.

3. Try to automate everything you find yourself doing for the 3rd time in a month.

👤 majkinetor
There is only one tip for you: if you are not programming, you are going to suck.

I have never met a very good sysadmin who wasn't a programmer at one point.

I am not talking about hard core programming involving serious patterns, DI, migrations etc. But most of the time, you should be writing good, resilient scripts in a more serious shell language such as PowerShell. If you don't do that, your actions are not reproducible and thus your output is open to interpretation (i.e. it sucks). Plus, there is no such thing in enterprise as a one-time task: if you don't script your task you will have to repeat it manually sooner or later, and that is extremely inefficient and error-prone.



👤 andresgaitan
3-year Linux sysadmin here, my 2c:

Buy ASAP both "UNIX and Linux System Administration Handbook" and "Practice of System and Network Administration".

UNIX and Linux System Administration Handbook -> Will give you the necessary tech skills and concepts. This is a sysadmin bootcamp.

Practice of System and Network Administration -> Will show you how to do the job right. This is more like 'how to manage' an operations team.

Each book is massive and will take you 1 month to read. I suggest you start with UNIX and Linux System Administration Handbook.


👤 computershit
Adding to the list of what's already been said:

1. Passwords / secrets management. Ensure there are no shared accounts in use, and no credentials or passwords on the company wiki. Roll out a password manager (LastPass if commercial, Bitwarden if you can self-host) for team and individual secrets.

2. Identify all public-facing endpoints and do an initial once over on the software that's backing them and any vulnerabilities (especially if they have anything Atlassian in their stack).


👤 technion
- What's your ticket system like? If people can just drop by your desk and ask about their printer, that's something you'll want to change.

- I know with your security background when Bob in accounts demands a domain admin password you won't just give it to him. But who has your back on that? Will he cry to the CEO who in turn will ask you to please do your job and give it to him? Get some relevant policies reviewed and signed off.


👤 readingnews
I have been an IT sysadmin or data center admin or sysadmin+sysadmin manager for almost 30 years. I have worked at small places and huge campuses... I will throw my $0.02 in, which may not be worth 1-bit.

You said:

> my background is with information security and it's going to be a big challenge for me because it's my first job.

I see a lot of good advice for people who kinda-sorta know what they are doing. You did not say if this was a Linux-based job or a Windows-based job. This makes a small difference. If this was a Windows-only shop, my advice would change slightly (e.g. Do you understand Windows automated deployment with a system like SCCM or whatever you have in-house? Do you know PowerShell? From your comment, I think the answer is "no".) If this was a Linux-only shop, again my advice would change (Do you know how to write any code in any language? From your comment, again, I think the answer is "no".)

You say you are confused about how to manage everything, I am assuming you do not mean "I have a huge pile of IT inventory" and what you mean is "I have 200+ different pieces of software/applications running, how to keep track of all of them, their status, upgrades, needs, etc."

Every sysadmin I ever worked with that was worth their weight was at one time a programmer, CS major or CS graduate. I do not know if you have the time to learn programming in this position, so I would ask the following question:

Is it your desire to be a sysadmin as a career or do you want to do Info Sec as a career?

If you want to be a sysadmin, you need to learn some programming. Pick something that fits in with what kind of shop you are in (win/linux/mixed). If you do not want to be a sysadmin, figure out only how to keep the ship afloat, and start looking for an InfoSec job. You will eventually dislike the sysadmin position. It takes a special kind of person to automate everything they do, and build things only to tear them apart later (https://www.sciencedirect.com/science/article/abs/pii/S01672...).

Most people in here have given great advice, but IMHO I am not sure if it fits in with your skill level. I do not mean any insult, but I knew InfoSec people who were expert sysadmins, and others who knew nothing about the job.


👤 malicebird
When you do inventory, check for everything that has an expiration date: domains, SSL certificates, 3rd-party software licenses/services... anything that can cause an outage because something wasn't paid for or expired.
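
Once those dates are written down, flagging the ones coming due is a few lines of Python. This is a sketch: the entries in `EXPIRIES` are hypothetical, and the 60-day window is an arbitrary choice.

```python
from datetime import date, timedelta


def expiring_soon(items, today, within_days=30):
    """Return (name, days_left) for items expiring within `within_days`.

    `items` maps a label to its expiry date. Already-expired items are
    included with a negative days_left. Results are most urgent first.
    """
    cutoff = today + timedelta(days=within_days)
    due = [(name, (expiry - today).days)
           for name, expiry in items.items() if expiry <= cutoff]
    return sorted(due, key=lambda pair: pair[1])


# Hypothetical inventory entries; fill in real dates from your registrar,
# certificate files and licence paperwork.
EXPIRIES = {
    "example.com domain": date(2030, 3, 1),
    "intranet TLS certificate": date(2030, 1, 20),
    "backup software licence": date(2030, 12, 31),
}

if __name__ == "__main__":
    due = expiring_soon(EXPIRIES, today=date(2030, 1, 10), within_days=60)
    for name, days_left in due:
        print(f"{days_left:4d} days  {name}")
```

In real use, pass `date.today()` as `today` and run it on a schedule so the list arrives before the renewal deadline does.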

👤 quartesixte
We don’t happen to work at the same place do we. . . ?

👤 haolez
The first thing is to ask for staff to be hired. You need to use the momentum of your own hiring to get this new staff approved.

👤 artie_effim
Codify an incident response plan. A playbook beats a chicken-with-no-head response.