Cloud Technical Lead

Location

Bengaluru | India

Work Mode

Commitment

Full-Time

No of Positions

1

Experience

12-14 years

Required Skills

ELK, Splunk, Elasticsearch, AppDynamics, Dynatrace, SolarWinds, Nagios, Grafana, Prometheus, BigPanda, ITIL

Job Description

  • 12+ years of experience implementing, operating, and maintaining IT systems and/or administering software functions in multi-platform, multi-system environments.
  • Minimum 4-5 years of experience working with cloud platforms.
  • Working knowledge of monitoring tools and platforms, plus experience with the ITIL framework and service delivery.
  • Hands-on experience in the design and implementation of a monitoring solution or framework for a complex setup.
  • Experience implementing and delivering monitoring solutions in development, QA, and Production environments.
  • Demonstrated competence in shell scripting and high-level programming languages, with a strong focus on Python.
  • Previous experience defining, creating, and supporting monitoring dashboards.
  • Experience working across departments evangelizing and communicating observability expertise and standards.
  • Possess practical knowledge and appreciation of various aspects of distributed service design, including messaging protocols, caching strategies, and autonomous software design practices.
  • Experience with the methodology of monitoring and observability tools such as ELK, Splunk, Elasticsearch, AppDynamics, Dynatrace, SolarWinds, Nagios, Grafana, Prometheus, BigPanda, Datadog, Site24x7, etc.
  • Strong understanding of the Open Systems Interconnection model (OSI model).
  • Solid understanding of performance metrics, KPIs, statistical calculations, machine learning, and correlation.
  • Ability to solve problems across the entire stack: operating systems (Linux/Unix/Windows), software, application, and network.
  • Responsible for design, development, testing, and implementation of monitoring applications to meet business process and application requirements.
  • Good understanding of alert management and logs for IT systems – servers, storage, network, database, etc.
  • Solid experience using observability data to debug systems, reduce the frequency and length of production incidents, and provide a cohesive overall view of system health.
  • Build and maintain solutions that provide insight into the infrastructure and services supporting applications, focusing on the logs, metrics, and application traces that improve observability.
  • Think about the problem end-to-end: automate data collection from common data sources, store the data efficiently in the application performance management and monitoring tool, render the information for users based on the defined SLOs and SLIs, and finally focus on actions by defining and delivering the activities on the monitoring roadmap.
  • Collaborate with operations & engineering teams, application developers, management, and infrastructure teams to assess near- and long-term monitoring needs.
  • Implement, maintain, and consult on the observability and monitoring framework that supports the needs of multiple internal stakeholders.
  • Keep an eye on emerging observability tools, trends, and methodologies, and continuously enhance our existing systems and processes.
  • Participate in process development with our Engineering and Development teams.
  • Effectively communicate tool capabilities and processes to varying stakeholders.
  • Collaborate with other teams on improving our observability systems.
  • Assist with driving monitoring and observability standards to improve the consumer experience of mission-critical applications, services, and business processes with a strong focus on the end-to-end journey.
  • Assist in scheduling and hosting regular tool training sessions to better enable tool adoption and best practices.
  • Provide input on improving the global operating model for monitoring and observability services.
  • Fine-tune the monitoring solutions so that the right incidents reach the IT services teams.
  • Troubleshoot performance and operation issues that arise with the monitoring platform.

How to Apply

Be part of a collaborative, fast-paced team at the forefront of innovation and technology advancements. Not only will you enjoy your work life at Relevance Lab, you’ll also have the opportunity to grow your skills and career. If you are passionate about driving results, we’d love to talk with you.
