Introduction
Many industries rely heavily on non-destructive testing (NDT) and inspection data to ensure the safety of their assets and operations. A data lake, which is a centralized repository that allows you to store all your structured and unstructured data at any scale, could be a solution for storing and managing this type of data [1]. Having a data lake allows for the centralization of all NDT data and inspection metadata cost-effectively: storing large amounts of data comes at a fraction of the cost of traditional storage methods, as it eliminates the need for X-ray film, chemicals, paper, and archive rooms. It also shortens access paths, because digital data, unlike physical media, can be retrieved from almost anywhere.
Artificial Intelligence Projects
One of the main benefits of storing and managing NDT data and inspection metadata in a data lake is that it is a valuable source for artificial intelligence (AI) projects. The large amount of data stored in a data lake can be used to train machine learning models, which can then be used to improve the efficiency of NDT and inspection processes.
One example of how data from a data lake can be used for AI in the petrochemical industry is in the development of predictive maintenance models. Using historical NDT data and inspection metadata, machine learning models can be trained to predict when equipment is likely to fail, allowing organizations to schedule maintenance and repairs proactively, reducing downtime, and increasing the overall efficiency of their operations.
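As a minimal sketch of such a predictive-maintenance model, consider fitting a linear wall-thinning trend to historical ultrasonic thickness readings and extrapolating to the minimum allowable thickness. The field values, thresholds, and the simple least-squares approach are illustrative assumptions; production models would be considerably more sophisticated.

```python
# Sketch: estimating remaining life from historical wall-thickness readings
# by fitting a linear degradation trend (ordinary least squares).
# All numbers and names here are illustrative, not from a real NDT system.

def fit_linear_trend(times, values):
    """Least-squares fit: values ~ slope * times + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    var = sum((t - mean_t) ** 2 for t in times)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope, intercept

def predict_failure_time(times, thicknesses, min_thickness):
    """Extrapolate the thinning trend to the minimum allowable thickness."""
    slope, intercept = fit_linear_trend(times, thicknesses)
    if slope >= 0:
        return None  # no thinning trend detected; nothing to predict
    return (min_thickness - intercept) / slope

# Example: yearly ultrasonic readings (mm) for one pipe segment
years = [0, 1, 2, 3, 4]
thickness = [10.0, 9.6, 9.2, 8.8, 8.4]  # losing ~0.4 mm per year
print(predict_failure_time(years, thickness, min_thickness=6.0))  # ~ year 10
```

In practice, a data lake makes this kind of model feasible by keeping many years of readings per asset queryable in one place.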
Another example is the use of AI-based image analysis to improve the accuracy and efficiency of radiographic test evaluation (e.g., to determine the residual wall thickness of pipelines or to check for erosion or corrosion). Machine learning models can be trained with historical inspection images to identify defects and anomalies, allowing for the automation of the inspection process, thereby reducing the workload on human inspectors and increasing the overall accuracy of the inspection process.
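A toy illustration of automated flaw flagging: treat a radiograph as a 2D grid of grayscale values and mark pixels darker than a threshold as candidate indications. Real systems would use trained convolutional models on calibrated images; the threshold and data below are purely illustrative assumptions.

```python
# Toy sketch of defect flagging on a radiographic image, represented as a
# 2D grid of grayscale values (0-255). Threshold and data are illustrative.

def flag_anomalies(image, threshold=60):
    """Return (row, col) positions darker than `threshold`, a crude proxy
    for indications such as voids or wall loss in a radiograph."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value < threshold
    ]

scan = [
    [200, 198, 201],
    [199,  40, 197],  # one dark spot: a possible defect indication
    [202, 200, 196],
]
print(flag_anomalies(scan))  # → [(1, 1)]
```

Historical inspection images stored in the lake, labeled by human inspectors, are what would turn a crude rule like this into a trained model.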
Data Lake Challenges
Data lakes also come with some challenges: they can be complex to set up and manage, requiring a certain level of technical expertise and specialized tools and resources. Additionally, data lakes can be difficult to secure and require proper data governance and management to ensure data accuracy, consistency, and completeness.
Preventing “Data Swamps”
Although a data lake allows for storing large amounts of raw data in its original format, data standardization is an aspect to consider when implementing a data lake for NDT data and inspection metadata in the petrochemical industry. Since data formats usually vary between NDT device manufacturers, the data cannot be fed directly into AI pipelines, and an unmanaged lake can end up as a so-called “data swamp.”
Standardizing the data within the data lake can have a lot of benefits, including better data quality, governance and reusability, increased agility, greater efficiency, and integrity across multiple systems.
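One way to picture this standardization step is a small normalization layer that maps each manufacturer's export format onto one common schema before records enter the lake. The vendor field names, units, and schema below are invented for illustration only.

```python
# Sketch: normalizing inspection records from different (hypothetical)
# device manufacturers into one common schema. All field names are
# illustrative assumptions, not real vendor formats.

COMMON_FIELDS = ("asset_id", "method", "thickness_mm", "inspected_on")

def normalize_vendor_a(record):
    """Hypothetical vendor A: metric units, flat field names."""
    return {
        "asset_id": record["AssetNo"],
        "method": "UT",  # this vendor's devices only do ultrasonic testing
        "thickness_mm": record["WallThk"],
        "inspected_on": record["Date"],
    }

def normalize_vendor_b(record):
    """Hypothetical vendor B: imperial units, ISO timestamps."""
    return {
        "asset_id": record["tag"],
        "method": record["ndt_method"],
        "thickness_mm": record["thickness_in"] * 25.4,  # inches -> mm
        "inspected_on": record["timestamp"][:10],       # keep the date part
    }

raw = {"tag": "P-101", "ndt_method": "UT", "thickness_in": 0.25,
       "timestamp": "2023-05-04T09:30:00Z"}
print(normalize_vendor_b(raw))
```

With every record reduced to the same fields and units, downstream AI training jobs can read from the lake without per-vendor special cases.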
ASTM International, formerly known as the American Society for Testing and Materials, recommends considering a standardized digital file format, Digital Imaging and Communication in Nondestructive Evaluation (DICONDE). It enables organizations to store data from various NDT methods, such as ultrasonic and radiographic testing, as well as inspection data from visual inspections and other sources, in a centralized location. DICONDE also ensures data is complete, locatable, unaltered, and has a traceable history.
DICONDE is an open standard for displaying, transmitting, and storing images and digital data from industrial materials testing. It allows signals and images to be exchanged and displayed between different DICONDE-compliant systems. Through this, DICONDE provides a vendor-independent data storage and transmission protocol for non-destructive materials testing. DICONDE is developed by Subcommittee E07.11 of ASTM International, a global standards organization.[2]
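Because DICONDE builds on the DICOM file format, its files carry the standard DICOM Part 10 header: a 128-byte preamble followed by the 4-byte magic value "DICM". A minimal validity check can be sketched as follows; real applications would use a full DICOM/DICONDE toolkit rather than this fragment.

```python
# Sketch: checking for the DICOM/DICONDE magic bytes that follow the
# 128-byte file preamble. A real pipeline would use a DICOM library
# to parse the full tag structure; this only inspects the header.

def looks_like_diconde(data: bytes) -> bool:
    """True if the data starts with a DICOM-style preamble + 'DICM' magic."""
    return len(data) >= 132 and data[128:132] == b"DICM"

sample = bytes(128) + b"DICM" + b"..."   # synthetic header for illustration
print(looks_like_diconde(sample))        # → True
print(looks_like_diconde(b"not a scan")) # → False
```

A check like this is useful at the lake's ingestion boundary, where files from many sources arrive and non-conforming ones should be routed to a quarantine area instead of the standardized store.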
Conclusion
In conclusion, data lakes are a powerful solution for storing and managing NDT data and inspection metadata in the petrochemical industry. They allow for the centralization of large amounts of data that can serve as a valuable source for AI projects, enabling organizations to improve the efficiency of their operations through automated defect recognition and predictive maintenance. Since data lakes also come with challenges, it may be reasonable to delegate their setup to a trustworthy software company.
References
- [1] AWS, “What is a data lake?” Amazon Web Services, https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/.
- [2] “DICONDE,” Wikipedia, https://de.wikipedia.org/w/index.php?title=DICONDE.