Job description
This is an excellent opportunity to join a diverse team in a flexible work environment.
Organizational Status
Work Performed
- Administer Linux servers running ELT-related software (e.g., JupyterHub, Docker) that are vital to the Department’s courses
- Develop, debug, and make improvements to open-source ELT software written in Python
- Monitor ELT applications and associated hardware systems for issues, and tackle these in a timely manner when they arise
- Diagnose and troubleshoot problems arising in ZFS (FreeNAS), Red Hat/CentOS Kernel-based Virtual Machines (KVM), and automation tools such as Kickstart and Ansible
- Diagnose and troubleshoot issues related to server hardware
- Provide comprehensive guidance to instructors who use ELT platforms, including helping instructors scope system needs given course content and class size, and verifying that the platforms function correctly
- Provide frontline support for instructor queries and issues regarding their ELT platforms
- Work with other members of the IT team to resolve ELT issues
- Transition student work to long-term storage to preserve records after courses end, and remove any old records that can be discarded
- Tear down and clean up ELT platforms and servers as necessary
- Conduct testing of customized software to ensure platforms and systems perform as expected
- Provide technical expertise, training, and consultation to members of our instructional teams and junior staff
- Establish and maintain documentation related to the management of ELT software and systems
- Maintain appropriate professional designations and up-to-date knowledge of current information technology techniques, programming languages, and tools
- Contribute to and lead other projects as required
Consequence of Error/Judgement
Supervision Received
Supervision Given
Minimum Qualifications
Undergraduate degree in a relevant discipline. Minimum of three years of related experience, or the equivalent combination of education and experience.
Preferred Qualifications
- Ability to work fluently with, and troubleshoot, servers running Linux-based operating systems. In-depth knowledge of Linux and other POSIX-compliant systems is strongly preferred
- Ability to diagnose and resolve application and performance issues by examining application and system logs and operating system instrumentation (e.g., knowledge of the /proc filesystem and the ability to use tools like sar to diagnose load issues and bottlenecks)
- Ability to troubleshoot technical problems in a timely manner within an IT team, including the ability to identify root causes, ask relevant questions, and look for data that helps to differentiate the symptoms of complex issues from their root causes
- Ability to suggest remedies that meet the needs of the situation and those directly affected, or refer issues to other members of the IT team appropriately
- Ability to read, understand, troubleshoot, and make improvements to software written in Python
- Knowledge of and working experience with Docker, JupyterHub, Jupyter notebooks, JSON, Canvas, HTTP REST APIs, and common Python libraries
- Knowledge of and working experience with Git and GitHub to maintain version-controlled source code repositories in a team, including making pull requests to improve external open-source software
- Familiarity with R and RStudio is an asset
- Communication skills to document common errors and symptoms, building institutional troubleshooting knowledge
- Ability to adapt to changing priorities, manage multiple tasks, and meet deadlines
- Aptitude for learning new technology, platforms, and processes
- Willingness to respect diverse perspectives, including perspectives in conflict with one’s own
- Demonstrated commitment to enhancing one’s own awareness, knowledge, and skills related to equity, diversity, and inclusion
About University of British Columbia
CEO: Arvind Gupta
Revenue: $2 to $5 billion (USD)
Size: 10000+ Employees
Type: College / University
Website: www.ubc.ca
Year Founded: 1915