Friday, March 5, 2021

Data Manager Bucket List

10 things you should do before becoming a Certified Petroleum Data Manager:

*  See real-time data from the field being used to update a 3D geomodel 
*  Be present for a wireline logging run 
*  Watch a seismic survey being acquired in the field
*  Compare the histograms for SEGY data loaded in IEEE and IBM format  
*  Visualize the scale of a seismic trace on an outcrop scale reservoir analog 
*  Visit a fully automated core storage facility
*  Read the hand-written notes on the margins of an original scout ticket  
*  Observe the beginning or end of a well test 
*  Hold a sample of a reservoir quality oil shale 
*  Look at the header block of a paper seismic section 
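
For the SEGY item in the list above, a minimal sketch of why the two histograms differ: the same four bytes decode to different numbers depending on whether they are read as IBM or IEEE single-precision floats. The function names and the sample word are illustrative assumptions, not taken from any particular loading package.

```python
import struct


def ibm32_to_float(word: bytes) -> float:
    """Decode 4 big-endian bytes as an IBM hexadecimal single-precision float."""
    (bits,) = struct.unpack(">I", word)
    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 24) & 0x7F                    # excess-64 exponent, base 16
    fraction = (bits & 0x00FFFFFF) / float(1 << 24)   # 24-bit fraction, no hidden bit
    return sign * fraction * 16.0 ** (exponent - 64)


def ieee32_to_float(word: bytes) -> float:
    """Decode the same 4 big-endian bytes as an IEEE 754 single-precision float."""
    (value,) = struct.unpack(">f", word)
    return value


sample = bytes.fromhex("42640000")  # hypothetical sample word
print(ibm32_to_float(sample))       # 100.0
print(ieee32_to_float(sample))      # 57.0
```

Loading a whole trace with the wrong interpretation shifts and distorts every amplitude in this way, which is what shows up when the two histograms are compared.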

The Core Shed

As part of a data transition project, I had the opportunity to speak with a group of high-level business consultants from a leading global firm, onsite to audit the change management plan between two organizations transferring a major producing oil field. This is the kind of conversation that makes it worth being a professional petroleum data manager. These gentlemen (and yes, in this case they were all gentlemen) wanted to know all about the data types that would be important to operating the field after the handover. Remember, these guys are used to doing audits on banks and internet companies, and have no oilfield experience. We talked about digital geoscience data, real-time operational data from the production platforms, QSHE, maintenance and logistics records, asset registers from SAP, and personnel on board logs. They hung in pretty well until the end of the conversation, which went something like this:

CG (Consultant Guy): So what other data types are there?
DG (Data Guy, that’s me): We have physical data too.
CG: You mean like documents in offsite storage?
DG: Yes, and some data we have stored in a core laboratory.
CG: A core laboratory, what is it?
DG: It’s a big building with an automated forklift, but that’s not important right now.
CG: No seriously, what is it?
DG: It’s a warehouse.
CG: What’s in it?
DG: Rocks.
CG: Excuse me?
DG: Rocks. We’re geologists, we study rocks, and we need someplace to keep them.
CG: So the national oil company is paying to keep a building full of rocks? Surely you can’t be serious.
DG: I am serious, and stop calling me Shirley. Actually, the rocks are quite valuable.
CG: Like how much?
DG: Well, when you consider the field acquisition and climate controlled storage ...
CG: Wait, the rocks get climate controlled storage?
DG: Yeah, otherwise boxes deteriorate, labels with critical metadata fall off, evaporative loss changes the water saturation, minerals can decay, … (noticing CG’s eyes glaze over) … and oil samples can degrade.
CG: You keep your oil there too?
DG: Yes, but only samples of fresh oil from the rocks … and then there is the CAT scan machine.
CG: A CAT scan machine? For rocks?
DG: Yeah, they run a scan on a rock sample, like they do on you in a hospital.
CG: Like in a hospital? What is it?
DG: It’s a big building where they keep sick people, but that’s not important right now.
CG: No seriously, they do CAT scans on rocks?
DG: Yeah, it lets you see internal structures like porosity, nodules, bedding and fractures …
CG: They X-ray rocks for fractures? Like they might have a broken bone in there?
DG: Not quite. See, there can be fractures in the rocks, and in a carbonate or shale reservoir the orientation of the fractures can help you … (seeing CG’s eyes glaze over again) … it can help support a decision about how to stimulate the reservoir.
CG: You stimulate the rocks?
DG: Yeah, sometimes we inject acid.
CG: I totally believe that. OK, so how much are these rocks worth?
DG: Well, like I was saying, if you count the value of the decisions being made, maybe around twenty million dollars.
CG: Twenty million dollars of rocks? I gotta see this place. Where is it?
DG: It’s about 90 kilometers from here.
CG: Can we go?
DG: (Grabbing a cooler of beer) … ROAD TRIP!

Author’s note: If you have never seen a fully automated, research enabled core and sample library, it’s on the top 10 bucket list for Certified Petroleum Data Analysts.

Thursday, March 4, 2021

AI and the Energy Transition

In late February I attended the Petroleum Club of Western Australia Industry Dinner on “Digitalisation, AI and Machine Learning in the Energy Sector”, billed as a discussion of how these technologies can contribute to productivity and the energy transition. The discussion got off to an early start as a participant in the Club’s Next Generation School Program described her experience in teaching students about technology using the example of a young boy given the job of tending a de-watering engine in an early English coal mine, who automated the process so that he could learn the game of marbles with his time instead [1]. Her point was that automation does not threaten jobs; it gives workers the opportunity to learn new skills. The panel discussion included Anthony Brockman, General Manager of Software Integrated Solutions for Schlumberger Australasia and Far East Asia, who is located in Perth. His recurring theme was that Australia is a natural hub for digital leadership in the energy industry, with a robust university talent pool, access to all major energy resources (hydrocarbon, mineral and renewable), a relatively supportive government regime (for now), and a history of innovation and success (he noted that during WWII, an Australian P.O.W. led one of the most successful mass escapes from a German prison camp [2]). He gave the example of a team from OMV’s Digital Excellence team in Austria who visited Perth to “see how digitalization works”, including an extended visit with Woodside, and he lamented that perhaps one of the few remaining barriers to the adoption of Artificial Intelligence is failing to see it as progress.
In discussing Schlumberger’s partnership with innovative companies at their technology research lab in Palo Alto in California’s Silicon Valley, Brockman revealed that one insight obtained was that the level of collaboration was one of the only reliable predictors for success of digital initiatives, and that they would have to tackle the fear of people losing their jobs to machines in order to fully realize the potential of digital platforms like DELFI. Other panellists included Tom Georke, Innovation Centre Lead for Cisco in Perth, who observed that he considered the oil and gas industry to be only “nascent” in the adoption of machine assisted or data led decisions. He had his comparisons to the financial technology sector roundly challenged as perhaps not being the best example of using technology to maintain a “license to operate”. One of the more interesting observations of the evening came from Miranda Taylor, CEO of National Energy Resources Australia (NERA), who asserted that while many energy industries are very good at innovating internally, “you can’t scale innovation with a supply chain that operates in a silo”. This aligns well with NERA’s mandate to foster collaboration and innovation and help the energy resources sector respond to workforce trends [3]. As a follow up to the discussion from Cisco, Richard Jones, VP of Asia Pacific for Dataiku, based in Singapore, polled the audience to see how many regularly used machine assisted recommendations. Only about 15% of the audience responded affirmatively, but I doubt many of them thought about letting their car GPS or public transport app tell them the best way to get to the hotel for the meeting, or ordering something online that “others who ordered this also liked”. So maybe the best success criterion for artificial intelligence is that people shouldn’t realize they are using it [4].

Monday, February 4, 2013

Billion Year Old Carbon

On the second floor of the Natural History Museum of London, at the end of the gallery containing their extensive scientific collection of minerals, in a room called only “The Vault”, against the back wall, are displayed a series of meteorites. One of them, the Cold Bokkeveld carbonaceous chondrite from South Africa, is accompanied by a small glass vial, thinner than your little finger. And at the bottom of that vial is an almost indistinguishable white smudge that, if your attention were not called to it, you might mistake for a fingerprint. But a quick reading of the accompanying label will stop you in your tracks. The material in that vial was obtained by vaporizing a small sample of the rock, and contains microscopic diamonds created in the interstellar void by the pressure of a supernova explosion, then transported to Earth by the meteorite. Their unique isotopic xenon ratios hint at an origin that predates our entire solar system, making them, in the words of the museum, quite simply “the oldest things you will ever see”. They are, literally, stardust on Earth, the “billion year old carbon” alluded to by Joni Mitchell, and possibly the most humbling item in the museum’s entire collection given their size. I, personally, found myself rooted to the spot for several full minutes. As a geologist, I have seen examples of the Jack Hills zircons from Australia, the oldest material on Earth, and I have found ways to think at scales that far exceed my own experience. But here was something that physically pushed those boundaries even further in both space and time, and I struggled with that comprehension all afternoon as I made my way back past the Jurassic fish skeletons and centuries-old tree rings that now seemed almost trivial, back out to the bustle of human existence that now, even more than ever, I am forced to recognize represents no more than an eyeblink and a mote in our larger understanding of the universe. I highly recommend the experience to anyone.

Saturday, March 10, 2012

Enabling an Exploration Data Access Strategy with Spatial Discovery

Preview of the text of a whitepaper; results were presented by Rio Tinto at the Prospectors and Developers Association of Canada 2012 International Convention, Trade Show & Investors Exchange in Toronto, Canada.

"Enabling an Exploration Data Access Strategy with Spatial Discovery"

Jess Kozman, QBASE; David Hedge, Conducive Pty Ltd

“...Spatial attributes are pervasive in energy geotechnical data, information and knowledge elements, and users expect enterprise search solutions to be map-enabled...”

Search solutions, similar to those already deployed at oil and gas companies, are now being piloted in both the mineral extraction and carbon sequestration segments of the resources industry. A recent pilot project in the exploration division of a major mining company within Australia has demonstrated the significant value gained from the effective integration of Enterprise Search technology, Natural Language Processing (NLP) for geographic positioning (geo-tagging) and Portal delivery. Branded internally as Spatial Discovery, the pilot project is part of a larger strategy to discover globally, access regionally, and manage locally the data, information and knowledge elements utilized in the mineral exploration division, namely: geochemical and geophysical data, documents (both internal and external, as well as those stored in structured and unstructured data repositories), GIS map data, and geo-referenced image mosaics. The initial stage involved validating a technology for spatial searches to enable streamlined, intelligent access to a collection of scanned documents by secured users, through scheduled automated crawls for geo-tagging, and following corporate security guidelines. This stage also included administrator training, including defining procedures for managing the document collections, procedures for maintaining the hardware appliance used for generating the spatial index, customizing the User Search Interface, and developing and implementing support roles and responsibilities. Functionality testing was run on a subset of documents representative of the enterprise collections that would need to be addressed by the Exploration Data Access (EDA) solution.

The next stages will focus on broadening the user base, with a goal of having access and use by all corporate geoscientists. This will be accomplished by defining, prioritizing and publicizing the spatial indexing of additional document collections, developing a methodology for managing and enhancing a custom gazetteer with geographic place names specific to the Australian mineral industry, and integrating with existing GIS map layers such as land rights. A proof of concept user interface will be rolled out to a selected User Reference group for input and feedback. The ongoing stages will be supported by utilizing a recently delivered testing and development hardware appliance, implementing connectors to existing electronic document management systems (EDMS) as well as portal delivery systems and SQL data stores, and a complete feature rich enhancement of the User Interface. This stage will also align the Spatial Discovery project with the larger Exploration Data Access (EDA) initiative and provide a proof of concept for enterprise search strategies based on best practices from other resource industries.

An essential part of this stage is the creation of a Customized Gazetteer to work with the NLP engine and geo-tagging software, which identifies geographic place names in text from multiple formats of unstructured documents and categorizes the index by location types such as country, region, populated place, mines, Unique Well Identifiers (UWI), camps, or concession identifiers. The index also allows sorting of search results by relevance based on natural language context, and Geo-Confidence, or the relative certainty that a text string represents a particular place on a map.
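
As a rough illustration of the gazetteer idea, here is a minimal sketch that matches place names in text and returns them with a feature type and a confidence value. The entries, the confidence numbers and the names `GazetteerEntry` and `tag_places` are hypothetical and do not describe the product used in the pilot.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GazetteerEntry:
    name: str
    feature_type: str
    lat: float
    lon: float
    confidence: float  # assumed prior that the string really names this place


# Hypothetical sample entries; a real gazetteer would hold many thousands.
GAZETTEER = [
    GazetteerEntry("Darwin", "populated place", -12.46, 130.84, 0.7),
    GazetteerEntry("Kalgoorlie", "populated place", -30.75, 121.47, 0.9),
    GazetteerEntry("Western Australia", "region", -25.0, 122.0, 0.95),
]


def tag_places(text):
    """Return (character offset, entry) pairs, highest confidence first."""
    hits = []
    for entry in GAZETTEER:
        pattern = r"\b" + re.escape(entry.name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), entry))
    hits.sort(key=lambda hit: hit[1].confidence, reverse=True)
    return hits


for offset, entry in tag_places("Drill core from Kalgoorlie was shipped to Darwin."):
    print(offset, entry.name, entry.feature_type, entry.confidence)
```

In this sketch "Darwin" carries a lower confidence than "Kalgoorlie" only to show how an ambiguous string (it is also a surname) could be ranked below an unambiguous one.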

Future improvements to the system will include increasing the confidence in geo-tagging to correctly identify ambiguous text strings such as “WA” in locations and street addresses from context. This will correctly give documents referencing Asia Pacific regions a higher probability of “WA” referring to “Western Australia” instead of the default assignment to “Washington”, the state in the United States. The natural language processing engine can be trained using a GeoData Model (GDM) to understand such distinctions from the context of the document, and can utilize international naming standards such as the ISO 3166-2 codes for political subdivisions such as states, provinces, and territories. The capabilities of the natural language processing engine to use grammatical and proximity context become more important for the correct map location of documents when a populated place such as “Belmont, WA” appears frequently in company documents because of the location of a data center in Western Australia, for example, but could be confused with the city of Belmont, Washington, in the United States without contextual clues.
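
The “WA” case can be sketched as a simple context vote. This is only an illustration under stated assumptions: the cue words, the tie-breaking rule and the function name `resolve_wa` are invented here, and a trained NLP engine would use grammar and proximity rather than a bag of words.

```python
import re

# Hypothetical context cues for each reading of "WA".
CONTEXT_CUES = {
    "Western Australia": {"perth", "australia", "pilbara", "kalgoorlie"},
    "Washington": {"seattle", "usa", "spokane", "tacoma"},
}
DEFAULT = "Washington"  # mirrors the default assignment described in the text


def resolve_wa(text):
    """Pick the reading of 'WA' whose cue words appear most often in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {place: len(cues & tokens) for place, cues in CONTEXT_CUES.items()}
    best = max(scores, key=scores.get)
    tied = [place for place, score in scores.items() if score == scores[best]]
    # Without a clear winner, fall back to the default reading.
    return best if len(tied) == 1 else DEFAULT


print(resolve_wa("Data center at Belmont, WA near Perth, Australia"))
print(resolve_wa("Belmont, WA"))
```

The second call has no contextual clues at all, so the sketch falls back to the default, which is exactly the failure the document wants to avoid for company documents.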

The NLP engine is made more robust by an understanding of relative text strings such as “30 km NW of Darwin” and support for foreign language grammar and special characters such as those in French and Spanish. The current NLP engine also has the ability to locate and index date text strings in documents so that documents can be located temporally as well as spatially. Next stages of the deployment will include improvements to the current Basic User Interface such as automatic refresh of map views and document counts based on selection option context, support for the creation of “electronic data room” collections in EDMS deployments, URL mapping at directory levels above a selected document, and the capture of backup configurations to preserve snapshots of the index for version control of dynamic document collections such as websites and news feeds. The proof of concept User Interface already includes some innovative uses of user interface controls, such as user-selectable opacities for map layers, the ability to “lock” map refreshes during repeated pans, and utilities for determining geoid centers of polygonal features. Further results of the pilot show that there is the potential to replace the connectors currently in use, enabling an enterprise keyword search engine (EKSE) to perform internal content crawls and ingest additional document types and to pass managed properties to the geo-tagger to enhance the search experience. The performance of remote crawling versus having search appliances physically located in data centers is also being evaluated against the constraints of limiting the content crawled from individual documents.
The pilot project is designed to validate the ability of the geo-tagging tool to share an index with enterprise keyword search engines, and to use Application Programming Interfaces (APIs) to provide the results of document ingestion and SQL-based structured data searches to both portal delivery systems and map-based “mash-ups” of search results.
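
A relative string such as “30 km NW of Darwin” can be turned into an approximate map position once the anchor place is known. The sketch below assumes a flat-earth approximation, which is adequate only for short offsets, a single hypothetical anchor entry, and eight compass points; none of this describes the actual engine.

```python
import math
import re

BEARINGS_DEG = {"N": 0, "NE": 45, "E": 90, "SE": 135,
                "S": 180, "SW": 225, "W": 270, "NW": 315}
KNOWN_PLACES = {"Darwin": (-12.46, 130.84)}  # hypothetical gazetteer lookup
KM_PER_DEG_LAT = 111.32                      # approximate length of one degree of latitude

PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*km\s+(NE|NW|SE|SW|N|E|S|W)\s+of\s+([A-Z][A-Za-z]+)"
)


def locate_relative(text):
    """Return (lat, lon) for the first 'D km DIR of PLACE' phrase, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    anchor = KNOWN_PLACES.get(match.group(3))
    if anchor is None:
        return None
    distance_km = float(match.group(1))
    bearing = math.radians(BEARINGS_DEG[match.group(2)])
    lat, lon = anchor
    # Flat-earth offsets: north component changes latitude, east component longitude.
    dlat = distance_km * math.cos(bearing) / KM_PER_DEG_LAT
    dlon = distance_km * math.sin(bearing) / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return (lat + dlat, lon + dlon)


print(locate_relative("Samples were collected 30 km NW of Darwin in 2011."))
```

The result lands north and west of the anchor, as expected; a production system would use a geodesic calculation rather than this approximation.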

The goals of the successful proof of concept stage were: to demonstrate that the geo-tagger could ingest text provided by the keyword search ingestion pipe, without having to duplicate the crawl of source documents; to use metadata from keyword search for document categorization such as product type, related people, or related companies; and to provide a metadata list of place names, confidence and feature types back to the search engine. The resulting demonstrated functionality moves towards providing “Enterprise Search with Maps”. The completed EDA project is sponsored by the head of exploration and will remove the current “prejudice of place” from global search results for approximately 250 geotechnical personnel for legacy data and information, in some cases dating back to 1960. The solution supports a corporate shift in focus from regional activity focused on projects and prospects with a 24 to 36 month timeline to global access that will no longer be biased toward locations with first world infrastructure, and eliminates the need for exploration personnel to take physical copies of large datasets into areas with high geopolitical risk. The corporate Infrastructure Services and Technology (IS&T) group is the main solution provider in the project with the ongoing responsibility for capacity, networking and security standards management. The deployed solution will have to support search across global to prospect scales, and roles including senior management, geoscience, administrative, data and information managers, research and business development. The focus is on a single window for data discovery that is fast and consistent, with components and roles for connected search and discovery solutions. The entire solution will be compatible with the architecture used for the broader context of a discovery user interface and data layer for mineral exploration.

Further work identified during the Proof of Concept included developing strategies for documents already ingested prior to establishing the keyword search pipe, merging licensing models for the keyword and spatial search engines, and adding full Boolean search capability to the spatial keyword functions. In the current implementation, the user is supplied with a larger search result from the keyword search, while the spatial search returns only those documents with spatial content that allows them to be placed on a map. Conversely, the keyword results will receive place name metadata for searching, but will be limited in map capabilities. Identified benefits from the Proof of Concept were that separate collections of documents did not need to be built for the spatial search engine, the single crawler reduced load on the file repository, and additional connector framework development was not required. The next stage will validate a security model managing document security tokens in the ingestion pipe.

The baseline architecture was also validated during the Proof of Concept phase. In this architecture, the enterprise keyword search engine (EKSE) passes text from crawled documents individually to the enterprise spatial search engine (ESSE). The ESSE then extracts metadata and processes text using the Natural Language Processing (NLP) engine looking for geographic references. The ESSE passes back managed properties for locations, rating of confidence in location, and feature type (such as mining area, populated place, or hydrographical feature). The GeoData Model (GDM) and Custom Gazetteer provide a database of place names, coordinates and features. The system is combined with an existing ESSE component licensed on production for 1 million geo documents, to be used with the geo-tagger stream processor. Geo-confidence results are being analyzed to evaluate the impact of misread characters from digital copies of documents produced through optical character recognition (OCR), and ambiguous character strings such as “tx” being an abbreviation for “transmission” in field notes for electromagnetic surveys as well as a potential spatial location (U.S. Postal abbreviation for Texas).
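
The hand-off between the two engines can be sketched as follows. Both functions are stand-ins written for illustration (`ekse_ingest`, `esse_tag` and the two-entry gazetteer are assumptions), but they show the shape of the exchange: one crawl, text passed document by document, and managed properties passed back.

```python
from typing import Dict, List

# Hypothetical mini gazetteer: place name -> (feature type, confidence).
GAZETTEER = {
    "Pilbara": ("mining area", 0.9),
    "Darwin": ("populated place", 0.7),
}


def esse_tag(text: str) -> List[Dict[str, object]]:
    """Stand-in for the spatial engine: managed properties for one document's text."""
    properties = []
    for name, (feature_type, confidence) in GAZETTEER.items():
        if name.lower() in text.lower():
            properties.append(
                {"place": name, "feature_type": feature_type, "confidence": confidence}
            )
    return properties


def ekse_ingest(crawled: Dict[str, str]) -> Dict[str, Dict[str, object]]:
    """Stand-in for the keyword engine: crawl once, tag each text, keep what comes back."""
    index = {}
    for doc_id, text in crawled.items():
        index[doc_id] = {"text": text, "managed_properties": esse_tag(text)}
    return index


def mappable(index: Dict[str, Dict[str, object]]) -> List[str]:
    """Documents that can be placed on a map, i.e. those with at least one location."""
    return [doc_id for doc_id, record in index.items() if record["managed_properties"]]


index = ekse_ingest({"a.pdf": "Soil sampling in the Pilbara", "b.pdf": "Quarterly budget"})
print(mappable(index))
```

Every document stays searchable by keyword, while only those with returned locations appear in the map view, matching the split between keyword and spatial results described earlier.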

Recent technology partnerships have included providers of Web Map Services (WMS) that incorporate the idea of large amounts of static or base layer data (land boundaries, Geo-Referenced images and grids) overlain by dynamic operational data such as geophysical and geochemical interpretations. Other development strategies may include launching search in context from analytic applications, conforming to public OGC standards, using the “shopping cart” concept of commercial GeoPortals, and arranging spatial metadata and taxonomies along the lines of ISO content categories.

The pilot project team identified several achievements from the Proof of Concept phase. Documents ingested by the keyword search engine that had place name references were successfully located on the user map view. Categories passed from the keyword search such as source or company names were able to be searched in the spatial search engine as document metadata. Also, feature types and place names with location confidences were provided, appearing on the spatial search page as managed properties. In the deployment phase, the system will be enhanced with security implemented by passing the access control lists associated with each document through the ingestion pipeline and processing them for replicated security in the spatial search engine. Improved presentation of returned managed properties will allow them to be managed for use as a refined list. Search categories can be selectable from an enhanced user interface to allow, for example, selection of a product type for search refinement. This will complement the current Boolean search parameters available in the map view.

The enhanced User Interface also presents the density of search results, the directory location of located documents, and the file type of the document. The map view also allows a more Australasia-centric map experience by removing the arbitrary “seam” at the International Date Line (Longitude 180 degrees) so the region can be centered on a map.
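
One common way to remove such a seam is to express each longitude as an offset from a chosen central meridian. The sketch below is an assumption about how this could be done, not a description of the pilot's map viewer; `recenter` is a hypothetical helper.

```python
def recenter(lon_deg: float, center_deg: float = 180.0) -> float:
    """Return the longitude as an offset in [-180, 180) from the central meridian."""
    return ((lon_deg - center_deg + 180.0) % 360.0) - 180.0


# Two points on either side of the 180 degree meridian end up next to each other.
print(recenter(179.0))    # -1.0
print(recenter(-179.0))   # 1.0
```

With the map centered on 180 degrees, features that straddle the Date Line plot side by side instead of at opposite edges.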

The concept of “Enterprise Search with Maps” will be driven as part of the architecture of the Exploration Data Access project, and the level of integration may be impacted by decisions of future versions of the corporate portal. Next steps include evaluating the relative costs and benefits of the enterprise licenses and how they are consumed and checked out during the crawl and display processes, the potential use of licenses for each active geo-tagged document versus the use of managed properties, direct indexing of spatial databases and geotechnical repositories using the keyword search engine, and security implementation. A third party application is also being used to scan and categorize documents discovered with GeoTagging in order to extract and protect potentially sensitive information.

The finalized solution will provide a holistic search interface that allows geotechnical users to answer essential questions about both the structured and unstructured data in their enterprise, improving efficient access to mission critical data and reducing the risk of geotechnical decisions.

Thursday, January 19, 2012

The science of shale drilling

The Houston Geological Society, which bills itself on its website as “the world’s largest local geological society” (and if Houston continues to hold its place among the five Texas cities in the top ten on the Men’s Health annual list of “America’s Fattest Cities”, this may be literally true), held an Environmental and Engineering Dinner meeting on 14-Dec to discuss the technology behind risk mitigation for shale gas development. Anthony Gorody, president of Universal Geoscience Consulting, Inc., has been very involved in baseline groundwater sampling and forensic analysis of stray gas in shallow water wells. This was especially timely given the recent identification in Pavillion, Wyoming of compounds in two deep monitoring wells (see the EPA draft report) that the EPA described in their press release as “consistent with gas production”. Mr. Gorody repeated the industry’s stance that relatively few water wells are impacted by drilling operations and that to date there have been no unambiguously documented cases of groundwater contamination directly attributed to hydraulic fracturing itself. Instead, he showed that many issues have been with stray gas being released from insufficient cement jobs rather than completion operations (note even the EPA report references “gas production” first rather than hydraulic fracturing). Compounding the issue is the fact that drilling operations have now moved from areas like Wyoming, where a one mile radius of investigation around a gas well might encounter only two water wells, to populated lands in Pennsylvania, where the same footprint might include up to 60 privately drilled water wells.
Gorody noted that hydraulic fracturing is not a new technology (it has been in use since 1947) and showed what may be the best image in the public domain to explain the relationship between the depth of water wells and the depth and extent of hydraulic fracturing, which should allow the general public to understand the physical inability of artificial fractures to propagate to groundwater levels in a play like the Barnett.
He used some basic physics to show that small shale pores are two orders of magnitude smaller than the molecular sizes of larger hydrocarbons, and that fractures must be vertically confined in order to create strain release, and slammed the recent EPA report for confusing “coincidence and collocation with cause and effect”. He gave a rule of thumb that the energy released in a typical fracture job is about the same as dropping a gallon of water from head height, while the audience tried to imagine that amount of energy fracturing over 3000 feet of consolidated rock. The high visibility and impact of shale gas drilling operations, however, with 3-5 million gallons of water being trucked in per well, and highly mobile rigs moving through what used to be rural countrysides, has led many community organizations to cast a skeptical eye on the industry. Gorody emphasized the impact on instrumentation and monitoring technology, showing that pressure plots and noise surveys do not show any evidence of fluid or gas release, that increased sampling and analysis of gas shows is providing a fossilized history of hydrocarbon expulsion in many basins, and that baseline water sampling is providing a boon in data density and data mining for forensic geochemists studying aquifers, paid for by risk-averse and litigation-wary operators. The voluntary and regulatory release of chemical information supports studies that show produced gases are not isotopically the same as gases found in water wells, and that buoyant hydrocarbons from depth escaping from failed casings, uncemented annuli, and compromised casing cement bonds can invade shallow aquifers and re-suspend colloidal complexes and sediments that have normally settled to the bottom of water wells. This creates a reducing environment in the well pump intake port, and the bacterial conversion of toxic sulfides that are then reported as odiferous and noxious. 
It only takes 87 psi of stray gas to overcome hydraulic head and invade water wells drilled to 200 feet; that is less than the pressure in a standard bicycle tire. Since around 85% of the population can detect sulfides such as H2S at levels of 0.03 ppm, the population quickly becomes aware of the degradation of their water supply, but Gorody noted that this process is completely independent of the hydraulic fracturing used to complete the well, and that maybe the term “hydraulic fracturing” should not even be used when describing vertical wells. Gorody pointed out that as soon as the annulus is squeezed, the problems with water contamination that occur in the first weeks to months after drilling go away, and that nearby monitoring wells may fail to intersect the tortuous paths used by the stray gas to migrate between the gas producing and water wells. His suggestions for reducing these impacts include monitoring of mud logs for gas shows during drilling, cementing off shallow gas shows to prevent leakage, sampling gas for its isotopic fingerprint during drilling to differentiate it from produced gas, running cement bond logs, and coproducing or venting casinghead gas. In the end, it was very enlightening to hear such a sober, scientific evaluation of the technology being used to track gas in shale plays, as opposed to the usual dialogue in places like mainstream media.
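
The 87 psi figure quoted above is consistent with the hydrostatic pressure of a 200 foot column of fresh water. The short check below assumes a fresh water density of 62.4 lb per cubic foot and a full water column to the bottom of the well; both are assumptions for illustration, not values from the talk.

```python
FRESH_WATER_LB_PER_FT3 = 62.4   # assumed density of fresh water
SQ_IN_PER_SQ_FT = 144.0


def hydrostatic_psi(depth_ft: float) -> float:
    """Pressure in psi exerted by a fresh water column of the given height in feet."""
    gradient_psi_per_ft = FRESH_WATER_LB_PER_FT3 / SQ_IN_PER_SQ_FT  # about 0.433 psi/ft
    return depth_ft * gradient_psi_per_ft


print(round(hydrostatic_psi(200.0)))  # 87
```

Gas arriving at the bottom of the well at more than this pressure can displace the water column, which is the sense in which it "overcomes hydraulic head".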

Wednesday, December 21, 2011

Chevron at the SPE Digital Energy Study Group

Jim Crompton, Senior IT Advisor for Chevron, addressed the SPE Digital Energy Study Group in Houston on 16-November on the topic of the “Digital Oil Field IT Stack”. He announced that he wanted to address and be a bit provocative about what he described as two recognized barriers that came out of the panel discussions at the SPE ATCE in Denver a couple of weeks before. His experience comes from implementing what he described as “gifts” from the Chevron central organization in diverse business units for real world application. He felt the two unaddressed barriers were change management, and the need for a standard infrastructure and architecture, which he proposed to describe. His presentation started with some standard, but according to him, neglected trends in the expanding scope and role of IT, including increased digitization, a move into plants and fields, and the need to address the latest generation of IT consumers, which he described as the first generation of oilfield workers to have better IT infrastructure in their homes than at work. He acknowledged that in many cases, the “first kilometer” is still a problem, such as where an entire offshore field may be instrumented with fibre optics, but the link to the onshore office is still via low bandwidth microwave links (he only half jokingly suggested lack of telecom coverage as a positively correlated indicator for oil occurrence). So how do we leverage the hundreds of thousands of sensors on a new greenfield platform and move from a “run to failure” mode to one of proactive failure detection and avoidance? Jim cited some examples of predictive analytics, Statoil’s experiments with injected nano sensors that report back on reservoir conditions, distributed sensors for real-time optimization, and new mobility platforms for field workers.
But the most interesting new idea was that of borrowing sensor mesh architectures from agricultural and military applications to go beyond current de-bottlenecking workflows and address the advanced analytics used by electrical engineers in their instrumentation. He indicated such a robust and cheap architecture “pattern” might be one of maybe half a dozen that an IT group like Chevron’s might use to provide semi-customizable solutions. Part of the frustration he acknowledged was that, at least at Chevron, his best Visual Basic programmers are petroleum engineers using Excel, and they are more in touch with Microsoft development plans than his IT group and upset that the next version of Excel will remove Visual Basic and move it to the SharePoint platform. Faced with Chevron now having over 20 million Gigabytes of digital data under management, he suggested treating the information pipeline in the same way we manage hydrocarbon pipelines, and trying to prevent “leaks” to unmanaged environments, like Excel. He showed some digital dashboards that could provide a balance between real time surveillance and advanced modeling, mix the needs of mapping and reporting services, and move organizations up the Business Intelligence maturity model. He finished with a quick nod to Hadoop solutions and a need to move away from “creative solutions that only solve when the creator is present”.