Job Title: QA Engineer
Responsibilities:
- Responsible for the overall health, performance, and capacity of gaming platform services
- Monitor and manage the gaming platform to ensure SLAs are met
- Build and manage systems, infrastructure and applications through automation
- Spend 40-60% of your time on Ops-related activities; the remaining time should be spent writing code and building systems to improve performance and operational efficiency
- Develop strategy and processes, and shape our existing infrastructure and support procedures
- Regularly check code into our continuous integration pipeline
- Participate in periodic on-call duties
- Work with other DevOps engineers and SREs.
What you'll be doing:
- Serve as a primary point of contact responsible for the overall health, performance, and capacity of gaming platform services.
- Troubleshoot issues across the entire stack (hardware, software, application, and network) in both physical hardware and cloud-based environments.
- Gain deep application-level knowledge of the systems, contribute to their overall design, and drive standardization efforts across multiple disciplines and services
- Identify and drive opportunities to improve automation for the company (continuous delivery)
- Manage timely resolution of all critical and/or complex problems meeting SLA requirements
- Participate in a 24x7 on-call rotation
- Communicate effectively with all levels of management and all stakeholders
- Develop, configure and optimize service and application monitoring and telemetry
- Assist in the roll-outs and deployment of new product features and installations
- Develop tools to improve our ability to rapidly deploy and effectively monitor applications and services in a large-scale environment
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment.
Who you are:
- A technology or business graduate degree, or equivalent experience and knowledge of IT governance and operations.
- Strong knowledge of current IT methodologies and systems technologies and standards.
- Always keep IT security in mind in whatever you do.
- Actively contribute to SRE/DevOps best practices.
- Passion to replace manual work with code that can enable a system to run itself, i.e. an around-the-clock mindset of “eating”, “breathing”, and “sleeping” automation of everything.
- Hands-on experience including, but not limited to:
- Excellent ability to script
- Proven successful track record of running multi-node/multi-tier web application architectures
- Experience with configuration management tools
- A high level of knowledge of Java. Strong coding skills expected (i.e. desire and ability to write quality code)
- Familiarity with maintaining long-lived APIs
- Experience in monitoring, reporting and alerting using industry leading tools
- Experience with test and build systems such as TeamCity, Jenkins, Maven, and Ant
- Experience with cloud computing platforms and services such as Mesosphere DCOS, OpenStack, and AWS
- Senior-level experience supporting Linux and DB systems
- Experience working with virtualization software
- Proficient with TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures
- Strong communication, negotiation, and conflict resolution skills, and the ability to see a problem through to completion. Desire and ability to wear many hats (developer, engineer, specialist, troubleshooter, support, tester, inventor)
- Strong analytical and decision-making ability
- Systems thinking - the ability to see how parts interact with the whole (big picture thinking)
- Practical knowledge of various aspects of service design, including messaging