The Babylonian World Map, dating back to 600 BC, is considered the earliest map of the world. To create maps, early practitioners of cartography, the science of making maps and images showing geographic information, would sketch on clay tablets the various crevices and soil textures to depict the Earth’s surface.
Thousands of years later, clay tablets were replaced by paper and, more recently, iPads, but the task of sketching remains the same, particularly for geologists, who are less interested in mapping the world and more interested in finding new sources of natural energy in the Earth’s crust.
To compute the distance between two images, we used the Euclidean distance between their local binary pattern (LBP) feature vectors.
For example, to identify potential hydrocarbon accumulations (natural energy), geologists study seismic facies which reflect the parameters and textures – such as configuration, continuity, amplitude, and frequency – within rock layers (strata) of a depositional sequence. It’s common for them to use fairly simple sketches to communicate these concepts at different stages of their workflow.
During the interpretation process, a common task is to look for specific combinations of seismic patterns that may indicate geological structures or anomalies caused by fluid presence. Seismic interpretation remains a very human-centered process and is one of the most important bottlenecks in the oil and gas exploration phase. Recently, the field has shown interest in speeding up this process with machine learning. While early research has been promising, there is a lack of large, high-quality annotated data sets to use, as with many novel applications of artificial intelligence.
Synthetic data is one solution to address the data shortage, but it’s expensive. Our team at IBM Research in Brazil offers another option which relies on Generative Adversarial Networks (GANs) and hand sketches to generate realistic synthetic seismic images. To the best of our knowledge, it’s the first work to propose and develop such a concept for this domain. We will be presenting it today at the annual conference for the European Association of Geoscientists and Engineers (EAGE) in London.
Such a tool has several benefits: it lets geoscientists rapidly create and communicate many what-if scenarios, use synthetic seismic images to scan seismic cubes for similar structures, and generate training data for supervised machine learning algorithms.
GANs have received a lot of attention recently because of their use in creating deep fakes of photos and videos of famous figures in popular culture and the news, but in our research we use the technique to turn hand-drawn sketches into realistic seismic images.
To train and test our approach, we simulated the sketches made by geoscientists using image processing techniques. Five networks were trained, one for each sketch type, all with the same configuration. Each network was trained for 25 epochs on 34,000 samples taken at random.
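As a rough illustration of how a sketch might be simulated from a seismic image, the snippet below marks pixels where the local intensity gradient is strong, producing a simple binary edge sketch. The post does not say which image processing techniques were used, so the gradient threshold approach, the function name `simulate_edge_sketch`, and the `threshold` parameter are all assumptions made for this example.

```python
import math


def simulate_edge_sketch(image, threshold):
    """Return a binary edge sketch of a 2D grayscale image (list of rows).

    Hypothetical stand-in for the sketch simulation step: a pixel is set to 1
    when its central-difference gradient magnitude exceeds `threshold`.
    Border pixels are left at 0 because they lack a full neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    sketch = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal change
            gy = image[r + 1][c] - image[r - 1][c]  # vertical change
            if math.hypot(gx, gy) > threshold:
                sketch[r][c] = 1
    return sketch


# A tiny image with a vertical boundary between two "layers".
step_image = [[0, 0, 10, 10] for _ in range(4)]
edge_sketch = simulate_edge_sketch(step_image, threshold=5)
```

In a pipeline like the one described, each simulated sketch would be paired with the seismic image it came from to form one training sample.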
Fig 5: (a) Test subset 1 from crossline 1116 with its input sketches (b-f) and synthetic seismic images (h-l). (m) Test subset 2 from crossline 1196 with its input sketches (n-r) and synthetic seismic images (t-x).
To evaluate the networks trained for each sketch type, we performed both a quantitative and a qualitative evaluation. For the quantitative evaluation, we took the 750 test seismic images synthesized by each network and compared them with the original seismic images. Our best result achieved a median distance of 0.17 (see Table I).
For the qualitative analysis, we compared the proposed sketch types and their networks in terms of image similarity and visually inspected the quality and consistency of the synthesized seismic images (see Fig 5).
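The quantitative comparison could be sketched roughly as follows, taking the image distance to be the Euclidean distance between LBP vectors. This is a minimal illustration, not our actual implementation: it assumes the basic 3x3 LBP operator and a normalized 256-bin histogram as the "LBP vector", and the function names are hypothetical.

```python
import math
from statistics import median

# Offsets (row, col) of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a 2D grayscale image given as a list of equally long rows.
    """
    rows, cols = len(image), len(image[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            # Set one bit per neighbour that is at least as bright as the center.
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist]


def lbp_distance(image_a, image_b):
    """Euclidean distance between the LBP histograms of two images."""
    return math.dist(lbp_histogram(image_a), lbp_histogram(image_b))


def median_lbp_distance(originals, synthesized):
    """Median LBP distance over paired original and synthesized images."""
    return median(lbp_distance(o, s) for o, s in zip(originals, synthesized))
```

Calling `median_lbp_distance` with the list of original test images and the matching list of synthesized images from one network would give a single summary number per sketch type, where lower values mean the textures are closer.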
In conclusion, we evaluated five different sketch types and showed that it was possible to obtain very realistic seismic images with rather simple sketches. The combination of background colors – representing rock layers – and colorful edges was the sketch type that produced the best results.
As a next step, we would like to investigate the performance of the proposed methodology in other applications, such as image retrieval and classification. For example, our technique could help experts find similar datasets among terabytes of data using a simple sketch, and it could also support search in documents, peer-reviewed papers and reports.