Doctoral Defence: Marcelo Luis RUIZ RODRÍGUEZ

The Doctoral School in Science and Engineering is happy to invite you to Marcelo Luis RUIZ RODRÍGUEZ’s defence, entitled

Maintenance Optimization in Industry 4.0: A Deep Reinforcement Learning Approach to Sustainable Policy Development.

Supervisor: Prof Yves LE TRAON

Effective maintenance planning and scheduling are essential for manufacturing companies to prevent machine breakdowns and maximize uptime and production. These policies must also align with the principles of environmental integrity and social responsibility. Developing sustainable policies is challenging: it requires balancing economic, environmental, and social aspects while addressing uncertainties such as unexpected failures and variable time-to-repair. This thesis, conducted in partnership with Cebi, a company that designs and manufactures electromechanical components, addresses the challenge of developing sustainable maintenance policies in the face of uncertainty.

To this end, we propose a Deep Reinforcement Learning (DRL)-based approach to predictive maintenance, which we compare with traditional maintenance policies such as corrective, preventive, and condition-based maintenance, and evaluate against a range of metaheuristic and rule-based methods.
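For readers less familiar with these baselines, the sketch below contrasts their decision rules in a deliberately simplified form; the state fields and thresholds are illustrative assumptions, not the models used in the thesis.

```python
# Simplified decision rules for the baseline maintenance policies.
# State fields and thresholds are illustrative assumptions.

def corrective(machine):
    """Corrective maintenance: repair only after a breakdown."""
    return machine["failed"]

def preventive(machine, interval=100.0):
    """Preventive maintenance: service on a fixed schedule, regardless of condition."""
    return machine["hours_since_service"] >= interval

def condition_based(machine, threshold=0.8):
    """Condition-based maintenance: service when a monitored degradation signal crosses a threshold."""
    return machine["degradation"] >= threshold

state = {"failed": False, "hours_since_service": 120.0, "degradation": 0.85}
print(corrective(state), preventive(state), condition_based(state))  # False True True
```

A DRL policy replaces such fixed rules with a decision function learned from interaction with the (simulated) plant.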

As a first contribution, we studied how categorized levels of uncertainty in a manufacturing environment affect the failure distribution and time to repair under different maintenance policies. We evaluated the performance of DRL, a genetic algorithm-based simheuristic (GA-S), and rule-based (DR) decision-making systems in terms of mean time to repair (MTTR), machine uptime, and computational efficiency. The experiments covered simulated scenarios with different levels of uncertainty as well as a real manufacturing use case. The results show that DRL adapts exceptionally well to reduce MTTR, especially under high uncertainty. GA-S outperforms DRL and DR in total machine uptime (though not in MTTR) when configured with a high re-optimization frequency (i.e., hourly re-optimization), but its performance degrades rapidly as that frequency decreases. Furthermore, our study shows that GA-S is computationally expensive compared to the DRL and DR policies.
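As a rough illustration of the first two evaluation metrics, the sketch below computes MTTR and uptime from a simulated repair log; the log format is our own assumption for the example, not the data model used in the study.

```python
# Minimal sketch: MTTR and uptime from a list of (failure_start, repair_end)
# intervals for one machine. The log format is an illustrative assumption.

def mttr(repair_intervals):
    """Mean time to repair: total repair duration / number of repairs."""
    durations = [end - start for start, end in repair_intervals]
    return sum(durations) / len(durations) if durations else 0.0

def uptime(repair_intervals, horizon):
    """Fraction of the horizon during which the machine was operational."""
    downtime = sum(end - start for start, end in repair_intervals)
    return (horizon - downtime) / horizon

# Example: three failures over a 168-hour (one-week) horizon.
log = [(10.0, 14.0), (60.0, 61.5), (120.0, 126.0)]
print(f"MTTR   = {mttr(log):.2f} h")         # (4.0 + 1.5 + 6.0) / 3 ≈ 3.83 h
print(f"Uptime = {uptime(log, 168.0):.1%}")  # (168 - 11.5) / 168 ≈ 93.2%
```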

As a second contribution, we tackled the problem of scheduling technician-performed maintenance on multi-component identical parallel machines. We proposed a multi-agent DRL approach that learns the maintenance policy under the uncertainty of multiple machine failures. In this approach, DRL agents partially observe the state of each machine and coordinate maintenance scheduling decisions, dynamically assigning maintenance tasks to technicians with different skills across a set of machines. Experimental evaluation shows that our DRL-based maintenance policy outperforms classical policies such as Corrective Maintenance (CM) and Preventive Maintenance in terms of failure prevention and downtime, improving overall performance by approximately 75%.
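To make the coordination scheme concrete, here is a minimal, self-contained sketch of one decision step: each agent sees only its own machine's state, a per-agent value function proposes an action, and requested tasks are matched to idle technicians with the required skill. The toy Q-function and all field names are assumptions for illustration; in the thesis, the policy is learned by deep RL rather than hand-coded.

```python
def q_values(observation):
    """Stand-in for a trained per-agent Q-network (hand-coded here for the sketch)."""
    degradation, hours_since_repair = observation
    return {"do_nothing": 1.0 - degradation,
            "request_maintenance": degradation + 0.01 * hours_since_repair}

def decision_step(machine_obs, technicians):
    """Each agent acts on its own partial observation; requested tasks are
    matched greedily to idle technicians who have the required skill."""
    requests = []
    for machine, obs in machine_obs.items():
        qs = q_values(obs)
        if max(qs, key=qs.get) == "request_maintenance":
            requests.append(machine)
    assignment = {}
    for machine in requests:
        for tech, info in technicians.items():
            if info["idle"] and machine in info["skills"]:
                assignment[machine] = tech
                info["idle"] = False   # technician is now busy
                break
    return assignment

# Toy example: two machines, two technicians with different skills.
obs = {"m1": (0.9, 40.0), "m2": (0.2, 5.0)}   # (degradation, hours since repair)
techs = {"t1": {"idle": True, "skills": {"m1"}},
         "t2": {"idle": True, "skills": {"m1", "m2"}}}
print(decision_step(obs, techs))  # {'m1': 't1'}
```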

In our last contribution, we optimized maintenance scheduling from an economic perspective (maintenance, breakdown, and downtime costs), an environmental perspective (the carbon footprint produced during production), and a social perspective (the fatigue experienced by technicians during maintenance activities). We proposed an evolutionary multi-objective multi-agent deep Q-learning (EvoDQN) approach, in which multiple agents explore the preference space to maximize the hypervolume covered by these sustainable objectives. The results show the trade-offs between these objectives when our approach is compared to traditional maintenance policies such as Condition-Based Maintenance and CM, to deep Q-network policies trained with various fixed preferences, and to EvoDQN configurations with a higher number of agents. Finally, we evaluated our approach over the production cycle, where it demonstrated superior performance, resulting in increased profitability for the system.
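Hypervolume, the quantity EvoDQN's agents jointly maximize, measures how much of the objective space a set of policies dominates relative to a reference point. A minimal two-objective sketch is below (the reference point and front values are illustrative assumptions); the thesis works with three sustainability objectives, where the same idea applies in higher dimension.

```python
def hypervolume_2d(points, ref):
    """Hypervolume of a two-objective maximization front with respect to a
    reference point dominated by every solution. Sweep points by descending
    first objective and accumulate the newly dominated rectangles; points
    dominated along the sweep contribute nothing."""
    volume, prev_y = 0.0, ref[1]
    for x, y in sorted(points, key=lambda p: -p[0]):
        if y > prev_y:
            volume += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return volume

# Toy Pareto front, e.g. (profit, negated carbon footprint) scaled to [0, 5].
front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
print(hypervolume_2d(front, ref=(0.0, 0.0)))  # 3.0 + 2.0 + 1.0 = 6.0
```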