Site Reliability Engineer - Cloud Security
Herzliya, Tel Aviv, Israel | Engineering | Apr 04, 2022

As the world’s largest software company, Microsoft holds itself to high standards ensuring that our customers’ expectations are met. The role of the Site Reliability Engineer (SRE) in the Cloud Security Group is to help the team provide the highest level of availability, performance, cost, and supportability for our customers across their Azure cloud environments. You will be expected to confront real-world, large-scale challenges across some of the world’s most complex cloud deployments. We are passionate about enabling customers and team members to deliver agile, reliable, high-performance solutions at scale.  

We are looking for a team player to help us optimize and protect the software and systems behind our internal and customer offerings, keeping an ever-watchful eye on their reliability, latency, performance, and capacity.  


  • You will be part of a global team driving enormous scale and assuring operational excellence while gaining deep understanding of availability, performance, and security.  
  • Plan, deploy and maintain production infrastructure hosted on Azure.  
  • Design and develop automated solutions to support the production infrastructure.  
  • Evaluate and contribute to product & service design, help shape Site Reliability Engineering strategies, review specifications, design and improve upon core processes.  
  • Work closely with peer engineering teams on defining and implementing improvements to service monitoring and reporting to enhance reliability and availability.  
  • Solve problems in mission-critical services and create automated solutions to prevent problem recurrence.  
  • Provide operational support for day-to-day activities involving deployment of services, configuration of service interaction, etc.  
  • Participate in and enhance our data-driven culture by providing statistical trends and analysis using real service data to increase service health and quality.  
  • Conduct periodic on-call duties.  


You will collaborate closely with multiple teams across Microsoft to deliver key customer solutions and the technology to support them.  

- Engage in and improve the end-to-end lifecycle of services, from inception and design through deployment, operation, and refinement.  

- Analyze complex system behavior, performance, and application issues. 

- Apply modern software engineering practices to streamline deployments and drive down costs and operational overhead while meeting critical reliability and availability KPIs. 

- Work hand in hand with engineering teams to offer guidance and education on integration, testing, monitoring, and security across different technology stacks.  


We believe in enabling our team members to unlock their highest potential and invest a lot in building a collaborative and agile culture to achieve our business success. Join us and be part of our path to the next level of awesomeness!! 


Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.



Required Qualifications:   

  • 3+ years of software development: automation-related experience particularly valued. Scripting languages such as PowerShell, Python, etc. or other languages such as C#, C++ or Java are most relevant, but others are acceptable.  
  • 3+ years of software engineering experience, including testing, deploying, and supporting large-scale services on Azure, AWS, or similar environments.  
  • Experience with distributed systems, networking, capacity planning and system design.  
  • Excellent testing and troubleshooting skills.  
  • Expertise in problem solving and analyzing critical production service environments.  


Preferred Qualifications:  

  • Deep understanding of complex, large-scale online services (preferably Azure-based)  
  • Experience with build systems and CI/CD tools (Jenkins, Azure DevOps, etc.)  
  • Experience defining and measuring internal and customer-facing OLAs/SLAs  
  • Good understanding of programming languages, operating systems, and software development methods.