Cloud Operations and Support Engineer - Remote Opportunity
Join our team as a Cloud Operations and Support Engineer and play a key role in the success of our cloud operations! We are seeking a highly skilled and motivated individual to fill this remote position. As a Cloud Operations and Support Engineer, you will provide maintenance and support for our agency-level cloud environments, including Azure and AWS. In return for your expertise, you will receive a competitive salary and the opportunity to work with a talented team.
The Health and Human Services Delivery Center Cloud Operations and Support Engineer will be responsible for:
- Serving as the primary contact for all agency cloud operations-related queries and issues
- Offering after-hours support to ensure seamless operation of agency cloud services
- Creating, modifying, and deleting cloud alerts to monitor system performance
- Monitoring application workloads to ensure optimal performance
- Proactively detecting problems, managing events, and handling notifications and escalations
- Managing Major Incident Management (MIM) for agency priority 1 incidents
- Implementing automated remediation for recurring incidents
- Updating agency hosting and design documents as needed
- Managing activity documentation and approval chains for both agency-specific and enterprise activities
- Resolving agency cloud security alerts, ensuring compliance with security requirements, and monitoring certificate expiration and renewals
- Implementing agency security controls according to organizational standards
- Performing agency patch management on cluster environments, including Azure Kubernetes Service (AKS) clusters and Kubernetes version upgrades
- Developing and monitoring automated advanced sequences
- Conducting file-level restorations as needed
- Identifying cost-saving opportunities, monitoring and remediating excessive resource expenditures, and escalating cost-related issues
- Implementing billing and cost management tags for better resource allocation
- Maintaining the IT Service Management Knowledge Base portal for reporting and investigation
- Developing detailed management procedural manuals for each agency
To be successful in this role, you will need:
- To pass an extensive background check
- 3-5 years of experience as a System Engineer and/or Cloud Engineer with hands-on experience dealing with implementation, security, and standards/best practices in a cloud environment, including Azure and AWS
- In-depth knowledge of networking as well as connectivity to AWS and/or Azure (via Direct Connect and/or ExpressRoute)
- Hands-on experience with Microsoft Azure and Amazon Web Services (AWS) cloud services
- Administrator certifications in Azure and/or AWS (Preferred)
- Strong working knowledge of Azure and/or AWS
- Ability to work closely with multiple diverse delivery centers and their agencies while anticipating their needs and exceeding their expectations
- Strong organizational, communication, change management, and problem-solving skills
- Strong knowledge of all cloud technologies and the ability to keep abreast of, and maintain a deep technical understanding of, current and emerging technologies
- Proven operational experience with large enterprise environments
- Excellent communication skills, both oral and written, to clearly communicate with clients
We Encourage You to Apply!
Even if you feel you're not a perfect match, we'd still love to hear from you. We are looking for great people to join our friendly team.