100 Asset Integrity jobs in Saudi Arabia

Reliability Engineer

Saudi Aramco

Posted 12 days ago

Job Description

Aramco occupies a unique position in the global energy industry. We are the world's largest producer of hydrocarbons (oil and gas), with the lowest upstream carbon intensity of any major producer.

With our significant investment in technology and infrastructure, we strive to maximize the value of the energy we produce for the world along with a commitment to enhance Aramco’s value to society.

Headquartered in the Kingdom of Saudi Arabia, and with offices around the world, we combine market discipline with a generations-spanning view of the future, born of our nine decades of experience as responsible stewards of the Kingdom’s vast hydrocarbon resources. This responsibility has driven us to deliver significant societal and economic benefits not just to the Kingdom, but also to a vast number of communities, economies, and countries that rely on the vital and reliable energy that we supply.

We are one of the most profitable companies in the world, as well as amongst the top five global companies by market capitalization.

The Reliability Solutions Department (RSD) provides engineering support to Saudi Aramco Global Manufacturing’s wholly owned and affiliate facilities (Refining, NGL Processing, and Petrochemicals). This includes reliability, inspection, corrosion management, static equipment, rotating equipment, electrical equipment, and instrumentation and control support.

Your primary role as a Reliability Engineer is to support the Global Manufacturing portfolio in achieving the highest level of reliability for plant assets through a systematic approach and coordinated efforts. You will also administer a holistic system to benchmark, identify gaps, and deploy comprehensive solutions to enhance asset reliability.

Key Responsibilities

As the successful candidate you will be required to perform the following:

  • Collaborate with organization management as well as with facilities management to drive reliability initiatives and programs systematically.
  • Design, build and drive strategies to enhance reliability performance in order to minimize downtime and increase availability.
  • Promote and enhance reliability culture in the organization and portfolio facilities.
  • Manage an initiative from the inception of the idea to the deployment of the full program.
  • Oversee a system that proactively identifies and addresses potential issues/threats that could impact normal operation or equipment reliability.
  • Own the availability enhancement cycle, ensuring that major reliability/availability events are properly reported, tracked, and investigated, and that measures are taken to avoid recurrence.
  • Serve as a consultant to the different organization entities concerning reliability matters.
  • Communicate effectively with all stakeholders to provide updates on the progress of reliability initiatives, related challenges as well as the roadmap for future programs.
  • Ensure alignment and adherence to corporate standards, processes and guidelines.
  • Collaborate with corporate entities on reliability initiatives and programs to ensure full alignment.
  • Establish reliability KPIs and a related KPI structure for the different levels of the organization to oversee.
  • Analyze reliability metrics, identify areas of improvement, and provide cost-effective recommendations to uplift the overall reliability of the portfolio’s plants.
  • Develop requirements for training, certifications and professional development opportunities for reliability professionals and related fields.
  • Work with the different organizations and entities to develop digital solutions and tools concerning reliability.

As a successful candidate you will have:

  • A Bachelor’s degree in Engineering; an advanced degree in reliability is preferable.
  • 15 years of experience in reliability engineering in the Oil and Gas Industry’s downstream sector. Twenty years is preferable.
  • Previous experience in managing and driving reliability projects.
  • Demonstrated knowledge of various reliability and Root Cause Analysis (RCA)/investigation software.
  • Black-belt Six Sigma certification is a plus.
  • Strong leadership qualities and team-building skills to drive and liaise with people at all levels.
  • Excellent oral and written communication skills in English.
  • In-depth knowledge of change management and problem-solving techniques.

Working environment

Our high-performing employees are drawn by the challenging and rewarding professional, technical and industrial opportunities we offer, and are remunerated accordingly.

At Aramco, our people work on truly world-scale projects, supported by investment in capital and technology that is second to none. And because, as a global energy company, we are faced with addressing some of the world’s biggest technical, logistical and environmental challenges, we invest heavily in talent development.

We have a proud history of educating and training our workforce over many decades. Employees at all levels are encouraged to improve their sector-specific knowledge and competencies through our workforce development programs – one of the largest in the world.


Reliability Engineer

Baker Hughes

Posted today

Job Description

**Are you passionate about being part of a successful team?**

**Would you like to be part of a unique blend of innovative technology, advisory services and decades of reliability engineering experience?**

**Join our Baker Hughes Bently Nevada team**

ARMS Reliability, part of Baker Hughes Bently Nevada, is a leading global provider of reliability solutions, supporting some of the world’s largest resource, power and utility companies. Using innovative technology, advisory services and decades of reliability engineering experience, we are transforming the way companies manage the reliability of their assets.

**Partner with the best**

As a Reliability Engineer you will perform engineering services for customers to meet profit, quality and customer satisfaction targets.

As a Reliability Engineer you will be responsible for:

- Competently facilitating problem-solving sessions
- Executing work processes for allocated projects
- Ensuring that projects meet budget, schedule and quality outcomes
- Managing project budgets, timelines and deliverables where they fall into your areas of responsibility
- Estimating project steps and resources
- Liaising with clients to ensure timely communications, regular project status updates and proactive issue handling
- Maintaining project logs, ensuring accurate reporting of time and expenses for the entire project to support the company’s invoicing and payroll processes

**Fuel your passion**

**To be successful in this role you will**:

- Be degree-qualified in an Engineering discipline
- Have at least 5 years’ experience in a similar role in maintenance/reliability within the mining, processing, oil & gas, utilities or manufacturing industries
- Have experience in data analysis, including extraction from various data sources and the ability to summarize data into conclusions
- Have experience in leading a project team and have a working knowledge of the project elements including timelines, deliverables and budgets
- Have experience working with Failure Mode Effect Analysis (FMEA)
- Have experience in Reliability Centered Maintenance (RCM)
- Have experience with master data and CMMS

**Work in a way that works for you**

We recognize that everyone is different and that the way in which people want to work and deliver at their best is different for everyone too. In this role, we can offer the following flexible working patterns:

- Working flexible hours - flexing the times when you work in the day to help you fit everything in and work when you are the most productive

**Working with us**

Our people are at the heart of what we do at Baker Hughes. We know we are better when all of our people are developed, engaged and able to bring their whole authentic selves to work. We invest in the health and well-being of our workforce, train and reward talent and develop leaders at all levels to bring out the best in each other.

**Working for you**

Our inventions have revolutionized energy for over a century. But to keep going forward tomorrow, we know we have to push the boundaries today. We prioritize rewarding those who embrace change with a package that reflects how much we value their input. Join us, and you can expect:

- Contemporary work-life balance policies and wellbeing activities
- Comprehensive private medical care options
- Safety net of life insurance and disability programs
- Tailored financial programs
- Additional elected or voluntary benefits

**About Us**:
We are an energy technology company that provides solutions to energy and industrial customers worldwide. Built on a century of experience and conducting business in over 120 countries, our innovative technologies and services are taking energy forward - making it safer, cleaner and more efficient for people and the planet.

**Join Us**:
Are you seeking an opportunity to make a real difference in a company that values innovation and progress? Join us and become part of a team of people who will challenge and inspire you! Let’s come together and take energy forward.

Baker Hughes Company is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law.

Reliability Engineer

MM Management

Posted today

Job Description

**RCM Engineer - Mechanical**
- Project management and provision of Asset Integrity Management services (including RCM, ECA, ACA, RBI, RAM, FMEA, and RCA) for the Oil & Gas, Petrochemical, Power & Process industries.
- Development of the Asset Register, equipment hierarchy and functional locations in line with ISO 14224.
- Equipment Criticality Analysis (ECA) using Company Standard Guidelines to identify safety-critical equipment and high-, medium- and low-criticality equipment.

**Salary**: ﷼9,000.00 - ﷼12,000.00 per month

Cloud Reliability Engineer

Marc Ellis

Posted 12 days ago

Job Description

  • Design, implement, and maintain highly available and scalable cloud architectures on GCP and OCI.
  • Develop and implement best practices for cloud infrastructure reliability, including monitoring, alerting, and incident response.
  • Collaborate with cross-functional teams to design and implement automated solutions for infrastructure provisioning, configuration management, and deployment.
  • Conduct performance analysis, capacity planning, and optimization to ensure efficient resource utilization and cost-effectiveness.
  • Implement and maintain security best practices and compliance standards across GCP and OCI environments.
  • Troubleshoot complex issues related to cloud infrastructure, network, and application performance.
  • Participate in on-call rotation and respond to incidents in a timely manner to minimize service disruptions.
  • Stay current with the latest trends, tools, and technologies in cloud computing and reliability engineering.

Regional Reliability Engineer

Flowserve Corporation

Posted today

Job Description

Company Overview:
If a culture of excellence, innovation and ownership is what you’re searching for, consider putting your experience in motion at Flowserve. As an individual contributor, or as a leader of people, your enterprise mindset will ensure Flowserve’s position as the global standard in comprehensive flow control solutions. Here, your opportunity for professional development and industry leading rewards will be supported by our foundational commitments to the values of people first, integrity and safety. Thinking beyond opportunity and reward, at Flowserve, we are inspired by working together to create extraordinary flow control solutions to make the world better for everyone!

**Job Summary**:
Flowserve is looking for a Regional Reliability Engineer to support key Customers in the Middle East and Africa. In this function you will analyze common causes of failure and define reliability improvement strategies to extend equipment MTBR. Furthermore, using your technical expertise with pumps and mechanical seals, you will perform RCFAs on the more complex engineering problems before developing appropriate solutions that could range from technical upgrades to training programs.

You will work directly with Customers, visiting their facilities as necessary to perform your function. You will also work closely with Application Engineers, Sales, and other specialist departments within Flowserve.

**Responsibilities & Requirements**:

- Achieve contractual targets for equipment MTBR at the assigned Customer sites.
- Define reliability improvement strategies for Life Cycle Advantage agreements.
- Perform RCFAs on complex equipment failures within pumps and mechanical seals.
- Create technical upgrades to solve Bad Actor equipment problems.
- Provide technical support to Customers and Flowserve Application Engineers.
- The role is based in Dammam (KSA), and you must be willing to travel within the region when required.

**Experience / Skills**:

- Previous experience as a Reliability Engineer or in a similar role involving rotating equipment. Must have some experience with pumps or mechanical seals. Alternatively, a previous role as a Sales or Applications Engineer specializing in pumps and/or mechanical seals would also be desirable.
- Good data analytics skills with a high proficiency in MS Excel
- Experience performing Root Cause Failure Analysis investigations.
- Strong technical acumen and analytical thinking.
- Excellent communication skills
- Degree or equivalent in relevant field and 4+ years relevant experience.

**Req ID** : R-1011

EOE including Disability/Protected Veterans. Flowserve will also not discriminate against an applicant or employee for inquiring about, discussing or disclosing their pay or, in certain circumstances, the pay of their co-workers. Pay Transparency Nondiscrimination Provision

Senior Site Reliability Engineer

HALA, Riyadh

Posted 2 days ago

Job Description


HALA is a leading fintech player in the MENAP region that aims to redefine financial services and build the future bank of SMEs. HALA aims at empowering SMEs to start, run, and grow their businesses by providing them with cutting-edge financial and technological tools.


HALA currently holds multiple entities in UAE, Saudi Arabia and Egypt (including HALA Payments, HALA Cashier and HALA Logistics) and offers solutions that enable merchants to digitize their payments as well as manage their sales and operations.


Founded in 2017, HALA is duly licensed by the Saudi Arabian Central Bank as well as the Financial Services Regulatory Authority (FSRA) in Abu Dhabi Global Market.

Responsibilities:

  • Comply with HALA’s code of conduct and ethics
  • Promote HALA’s vision, mission and values, and model desired behaviors
  • Promote HALA and spread its culture
  • Commit to HALA’s rules and regulations
  • Perform tasks as directed in pursuit of organizational goals
  • Share know-how with the team and encourage their development

Job Specific:

  • Run the cloud environment by monitoring availability and taking a holistic view of system health
  • Build software and systems to manage platform infrastructure and applications
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Provide primary operational support and engineering for multiple large, distributed software applications
  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
  • Balance feature development speed and reliability with well-defined service level objectives
  • Deploy updates and fixes
  • Build tools to reduce occurrences of errors and improve customer experience
  • Perform root cause analysis for production errors
  • Investigate and resolve technical issues
  • Design procedures for system troubleshooting and maintenance

Education:

Bachelor’s degree in computer science, information technology, or equivalent field of studies

Years of experience may substitute for the education requirement.

Experience:

5-7 years of experience in a similar position (SRE, DevOps, or infrastructure engineer).

Skills:

  • Computer Skills: Advanced in Microsoft Office Tools
  • Languages: Fluent in English and Arabic
  • Advanced knowledge of compliance and regulations
  • Experience with Kubernetes administration.
  • Experience with infrastructure as code tools such as Terraform and Ansible.
  • Experience with at least one of the major cloud providers: AWS, GCP, Azure, or OCI.
  • Experience with architecting, developing, and troubleshooting large-scale systems.
  • Experience building CI/CD pipelines (preferably GitOps).
  • Experience with monitoring and observability tools such as Prometheus, Loki, Jaeger, and Sentry.
  • Experience managing databases such as PostgreSQL and MongoDB, including backup and restore plans, replication, and clustering.
  • Good networking knowledge (preferably experience with VPNs and Service Mesh)

What We Offer You

We believe you will love working at HALA!
  • We have an inclusive and diverse culture that encourages innovation and flexibility in remote, in-office, and hybrid work setups.
  • We offer highly competitive compensation packages, including the potential for shares.
  • We prioritize personal development and offer regular training and an annual learning stipend to tackle new challenges and grow your career in a hyper-growth environment.
  • Join a talented team of over 30 nationalities working in 7 countries and gain valuable experience in an exciting industry.
  • We offer autonomy, mentoring, and challenging goals that create incredible opportunities for both you and the company.
  • You will be given a lot of responsibility and trust. We believe that the best results come when the people responsible for a function are given the freedom to do what they think is best.
If you think you have what it takes to join a remarkable team #apply_now

Staff Site Reliability Engineer

Foodics

Posted 2 days ago

Job Description

Who Are We

We are Foodics, a leading restaurant management ecosystem and payment tech provider, founded in 2014 with headquarters in Riyadh and offices across 5 countries, including the UAE, Egypt, Jordan, and Kuwait. We serve customers and partners in over 35 countries worldwide. Our innovative products have processed over 6 billion orders, making Foodics one of the most rapidly evolving SaaS companies from the MENA region. Foodics has completed three funding rounds, with the latest raising $170 million in the largest SaaS funding round in MENA, enhancing our capabilities to serve business owners better.

The Job in a Nutshell

We are seeking a Staff Site Reliability Engineer (SRE) to join our high-impact engineering team. You will ensure the scalability, performance, and reliability of Foodics’ cloud-native platforms. Your role involves designing, implementing, and evolving infrastructure solutions and operational processes supporting millions of transactions daily, while promoting best practices in observability, incident management, and resilience engineering.

What Will You Do

  • Design and maintain scalable, highly available, and fault-tolerant systems across cloud providers (AWS, OCI).
  • Lead incident response efforts, conduct blameless post-mortems, and drive improvements.
  • Build and refine automated deployment pipelines for safe and repeatable changes.
  • Implement observability frameworks (metrics, tracing, logging) to detect and resolve performance issues proactively.
  • Collaborate with development teams to embed reliability into the software lifecycle.
  • Optimize infrastructure costs while maintaining service quality.
  • Drive chaos engineering experiments to validate system resilience.
  • Document architecture, runbooks, and operational processes for internal and cross-team use.

What Are We Looking For

We seek a reliability-focused engineer with strong technical skills, experienced in solving operational challenges at scale. You should be hands-on with distributed systems, cloud-native platforms, and automation tools.

  • Strong knowledge of SRE principles (SLIs, SLOs, SLAs) and operational excellence.
  • Experience with Kubernetes, container orchestration, and service mesh technologies.
  • Expertise in infrastructure as code (Terraform, Ansible; Crossplane is optional) and scripting (Bash, Python, Go).
  • Deep understanding of monitoring and alerting systems (Prometheus/Grafana, ELK, Loki, Datadog, AWS CloudWatch).
  • Skills in cloud networking, load balancing, and API gateways (NGINX, Kong, AWS API Gateway).
  • Experience with relational and NoSQL databases (MySQL, PostgreSQL, MongoDB, DocumentDB, Redis).
  • Familiarity with distributed tracing (Jaeger, OpenTelemetry) and chaos testing frameworks.
  • Excellent troubleshooting skills, capable of resolving high-impact incidents under pressure.

Who Will Excel

  • Candidates with experience operating high-traffic, mission-critical cloud-native platforms.
  • Those demonstrating strong collaboration and communication skills across teams.
  • Individuals with a data-driven approach to performance tuning and capacity planning.
  • Candidates thriving in fast-paced, high-growth SaaS environments and committed to continuous improvement.

What We Offer You

We believe you will love working at Foodics!

  • Competitive compensation packages, including bonuses and potential equity.
  • Annual learning stipend and regular training opportunities.
  • Exposure to cutting-edge cloud technologies and distributed systems.
  • A diverse, global team of over 30 nationalities in 14 countries.
  • Autonomy, challenging goals, and the opportunity to impact platform reliability serving millions.

Staff Site Reliability Engineer

Foodics, Riyadh

Posted 9 days ago

Job Description

Who Are We
We are Foodics, a leading restaurant management ecosystem and payment tech provider, founded in 2014 with headquarters in Riyadh and offices across 5 countries, including the UAE, Egypt, Jordan and Kuwait. We currently serve customers and partners in over 35 countries worldwide. Our innovative products have successfully processed over 6 billion (yes, billion with a B) orders so far, making Foodics one of the most rapidly evolving SaaS companies ever to emerge from the MENA region. Foodics has also completed three rounds of funding, with the latest raising $170 million in the largest SaaS funding round in MENA, boosting its innovation capabilities to better serve business owners.

The Job in a Nutshell

We are seeking a Staff Site Reliability Engineer (SRE) to join our high-impact engineering team. In this role, you will be responsible for ensuring the scalability, performance, and reliability of Foodics’ cloud-native platforms and services. You will design, implement, and evolve infrastructure solutions and operational processes that support millions of transactions daily, while championing best practices in observability, incident management, and resilience engineering. Your expertise will help us maintain world-class uptime and seamless customer experiences as we continue to grow at scale.

What Will You Do

  • Design and maintain scalable, highly available, and fault-tolerant systems across multiple cloud providers (AWS, OCI).
  • Lead incident response efforts, conducting blameless post-mortems and driving systemic improvements.
  • Build and refine automated deployment pipelines, ensuring fast, safe, and repeatable delivery of changes.
  • Implement robust observability frameworks (metrics, tracing, logging) to proactively detect and address performance issues.
  • Collaborate with development teams to embed reliability into every stage of the software lifecycle.
  • Optimize infrastructure costs while maintaining service quality.
  • Drive chaos engineering experiments to validate system resilience.
  • Document architecture, runbooks, and operational processes for internal and cross-team use.

What Are We Looking For

We’re looking for a reliability-focused engineer with strong technical depth who thrives on solving complex operational challenges at scale. You must be hands-on with distributed systems, cloud-native platforms, and automation tools.

  • Strong background in SRE principles (SLIs, SLOs, SLAs) and operational excellence.
  • Experience with Kubernetes, container orchestration, and service mesh technologies.
  • Proven expertise in infrastructure as code (Terraform, Ansible; Crossplane is optional) and automation scripting (Bash, Python, Go).
  • Deep understanding of monitoring and alerting systems (Prometheus/Grafana, ELK, Loki, Datadog, AWS CloudWatch).
  • Skilled in cloud networking, load balancing, and API gateway management (NGINX, Kong, AWS API GW).
  • Solid experience with relational and NoSQL databases in production (MySQL/PostgreSQL, MongoDB, DocumentDB, Redis).
  • Familiarity with distributed tracing (Jaeger, OpenTelemetry) and chaos testing frameworks.
  • Excellent troubleshooting skills and ability to resolve high-impact incidents under pressure.

Who Will Excel

  • Candidates who have successfully operated high-traffic, mission-critical platforms in a cloud-native environment.
  • Candidates who demonstrate strong collaboration and communication skills across engineering, product, and business teams.
  • Candidates who bring a data-driven approach to performance tuning and capacity planning.
  • Candidates who thrive in fast-paced, high-growth SaaS environments and embrace continuous improvement.

What We Offer You
We believe you will love working at Foodics!

  • Highly competitive compensation packages, including bonuses and potential equity.
  • Annual learning stipend and regular training to accelerate your career.
  • Exposure to cutting-edge cloud technologies and large-scale distributed systems.
  • A truly global team of over 30 nationalities in 14 countries.
  • Autonomy, challenging goals, and the chance to directly impact the reliability of platforms serving millions.

Senior Site Reliability Engineer

Canonical

Posted 9 days ago

Job Description

Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and industry leaders in many sectors. The company is a pioneer of global distributed collaboration, with 1200+ colleagues in 75+ countries and very few office based roles. Teams meet two to four times yearly in person, in interesting locations around the world, to align on strategy and execution.

The company is founder led, profitable and growing.

We are hiring a Senior Site Reliability Engineer

Next-gen operations at scale, with pure Python infra-as-code, from bare metal to containers and applications. Our goal is to perfect enterprise infrastructure devops.

We run hundreds of private cloud, Kubernetes, and application clusters for customers across physical and public cloud estate, and we are raising the bar on what's possible with automation by embracing a universal operator pattern and model-driven operations.

To succeed in this role you need to believe in automation as a pure software engineering problem, not a hack-it-till-it-works-for-me problem. You need to be interested in the scientific approach to operations at scale, driven by metrics and code, and you need to be able to learn the entire stack, from bare metal networking and kernel up to serverless and open source applications.

Location: Globally remote role

The role entails

Our cloud operations engineers bring Python software-engineering skills and rigour to the operations domain. We practise devsecops from bare metal to application. We architect and run OpenStack, Kubernetes and software defined storage, and we enable devsecops for applications running on that infrastructure too.

To become a member of this team, you need to be a software engineer fluent in Python, you need a genuine interest in the full open source infrastructure stack from metal to containers, and you need the ability to work in a high pressure operations environment with mission-critical services for global brand name customers.

As a member of the team you will gain experience in a broad range of cloud technologies. We evolve our offerings as the state of the art improves, so you get to stay current with the latest capabilities in open source infrastructure. We drive upgrades to keep our customers on the latest, best solutions.

What we are looking for in you

  • Degree in Software Engineering or Computer Science
  • Experience with Linux and familiarity with Linux networking and storage
  • Python software development expertise
  • Operational experience
  • Excellent interpersonal skills, curiosity, flexibility, and accountability
  • Ability to travel internationally twice a year, for company events up to two weeks long

Nice-to-have skills

  • Experience with OpenStack or Kubernetes deployment or operations
  • Familiarity with public or private cloud management

What we offer colleagues

We consider geographical location, experience, and performance in shaping compensation worldwide. We revisit compensation annually (and more often for graduates and associates) to ensure we recognise outstanding performance. In addition to base pay, we offer a performance-driven annual bonus or commission. We provide all team members with additional benefits, which reflect our values and ideals. We balance our programs to meet local needs and ensure fairness globally.

  • Distributed work environment with twice-yearly team sprints in person
  • Personal learning and development budget of USD 2,000 per year
  • Annual compensation review
  • Recognition rewards
  • Annual holiday leave
  • Maternity and paternity leave
  • Employee Assistance Programme
  • Opportunity to travel to new locations to meet colleagues
  • Priority Pass, and travel upgrades for long haul company events

About Canonical

Canonical is a pioneering tech firm at the forefront of the global move to open source. As the company that publishes Ubuntu, one of the most important open source projects and the platform for AI, IoT and the cloud, we are changing the world of software. We recruit on a global basis and set a very high standard for people joining the company. We expect excellence - in order to succeed, we need to be the best at what we do. Most colleagues at Canonical have worked from home since its inception in 2004. Working here is a step into the future, and will challenge you to think differently, work smarter, learn new skills, and raise your game.

Canonical is an equal opportunity employer

We are proud to foster a workplace free from discrimination. Diversity of experience, perspectives, and background create a better work environment and better products. Whatever your identity, we will give your application fair consideration.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Software Development
