Description
Job Title/Grade: Site Reliability Engineer (Grade 9)
Reports to: Senior Director, DevOps
Department: IT
Work Location: United States; United Kingdom; Belgrade, Serbia; Lagos or Abuja, Nigeria
Date: April 15, 2022
Function of the Team:
Many of DAI’s legacy systems, including its project information management system used around the globe, are undergoing a strategic transition to new platforms that will support DAI’s strategy to become a data-enabled, cloud-first organization. This transition reflects continuing technological change and our clients’ increased demands for data collection, reporting, and integration. In identifying and transitioning to the Target State Platforms, we also need to recognize that DAI projects have specific needs and sometimes operate in low-connectivity environments.
DAI’s current technology stack includes, but is not limited to, Oracle ERP, Azure Cloud, MS 365/.NET (including Dynamics, Power Apps, and Power BI), SharePoint, Lotus Notes, and a variety of End User Computing (EUC) applications.
Function of the Position:
The Site Reliability Engineer will work with the various IT teams to ensure that DAI’s IT systems and services are functioning properly and that issues, once identified, are addressed in a timely manner. The engineer also provides input into the rollout of new features and the adoption of new technologies.
Roles and Responsibilities:
- Set up monitoring systems to detect problems before they become outages.
- Work with development and operations teams to design, build, and maintain core infrastructure components that allow DAI to scale to support thousands of concurrent users.
- Perform root-cause analysis and host blameless “postmortem” meetings for unanticipated service disruptions.
- Document every action so that findings turn into repeatable actions, and then into automation.
- Contribute to the evaluation and adoption of new technologies, which may include JavaScript/TypeScript, Python, C#, and Kubernetes.
- Debug production issues across services and levels of the stack.
- Perform additional responsibilities as deemed necessary.
Qualifications:
Minimum Qualifications:
- Grade 9: Minimum of 7 years of relevant software development and/or site reliability engineering experience and a Bachelor’s degree in computer science or a similar area; or 5 years of relevant software development and/or site reliability engineering experience and a Master’s degree in computer science or a similar area.
- At least one year of demonstrated experience as a full-stack developer, with hands-on knowledge of languages such as Java or Python and exposure to application and infrastructure architecture.
- Knowledge of software development methodologies such as Waterfall, Agile, SAFe, and Spiral. (Experience with Agile is strongly preferred.)
- Experience collaborating cross-functionally on availability and performance issues to identify root causes, determine areas for improvement, and drive those actions to closure through effective solutions.
- Good written and oral communication skills, with the ability to communicate complex or technical information clearly, and tailor communication style to diverse audiences.
- Able to work independently and as part of a team.
- Able to build and maintain strong working relationships with staff at all levels of the organization and external clients from diverse backgrounds.
- Authorized to work in the United States, United Kingdom, Serbia, or Nigeria without sponsorship.
- Willing and able to adjust work schedule, if needed, to support teams located in different time zones and countries.
Equivalent education and experience will be considered.
Preferred Qualifications:
- Experience with ERP-type systems (Oracle ERP, SAP, etc.).
- Experience analyzing existing SQL queries for performance improvements.
- Familiarity with DevOps processes, including CI/CD.
policies and regulations