Monday, February 4, 2013

On the second floor of the Natural History Museum of London, at the end of the gallery containing their extensive scientific collection of minerals, in a room called only “The Vault”, against the back wall, are displayed a series of meteorites. One of them, the Cold Bokkeveld carbonaceous chondrite from South Africa, is accompanied by a small glass vial, thinner than your little finger. And at the bottom of that vial is an almost indistinguishable white smudge that, if your attention were not called to it, you might mistake for a fingerprint. But a quick reading of the accompanying label will stop you in your tracks. The material in that vial was obtained by vaporizing a small sample of the rock, and contains microscopic diamonds created in the interstellar void by the pressure of a supernova explosion, then transported to Earth by the meteorite. Their unique isotopic xenon ratios hint at an origin that predates our entire solar system, making them, in the words of the museum, quite simply “the oldest things you will ever see”. They are, literally, stardust on Earth, the “billion year old carbon” alluded to by Joni Mitchell, and, given their size, possibly the most humbling item in the museum’s entire collection. I, personally, found myself rooted to the spot for several full minutes. As a geologist, I have seen examples of the Jack Hills zircons from Australia, the oldest material on Earth, and I have found ways to think at scales that far exceed my own experience. But here was something that physically pushed those boundaries even further in both space and time, and I struggled with that comprehension all afternoon as I made my way back past the Jurassic fish skeletons and centuries-old tree rings that now seemed almost trivial, back out to the bustle of human existence that now, even more than ever, I am forced to recognize represents no more than an eyeblink and a mote in our larger understanding of the universe. I highly recommend the experience to anyone.

Saturday, March 10, 2012

Enabling an Exploration Data Access Strategy with Spatial Discovery

Preview of the text of a whitepaper; results presented by Rio Tinto at the Prospectors and Developers Association of Canada 2012 International Convention, Trade Show & Investors Exchange in Toronto, Canada

"Enabling an Exploration Data Access Strategy with Spatial Discovery"

Jess Kozman, QBASE; David Hedge, Conducive Pty Ltd

“...Spatial attributes are pervasive in energy geotechnical data, information and knowledge elements, and users expect enterprise search solutions to be map-enabled...”

Search solutions, similar to those already deployed at oil and gas companies, are now being piloted in both the mineral extraction and carbon sequestration segments of the resources industry. A recent pilot project in the exploration division of a major mining company in Australia has demonstrated the significant value gained from the effective integration of Enterprise Search technology, Natural Language Processing (NLP) for geographic positioning (geo-tagging), and portal delivery. Branded internally as Spatial Discovery, the pilot project is part of a larger strategy to discover globally, access regionally, and manage locally the data, information and knowledge elements utilized in the mineral exploration division, namely: geochemical and geophysical data; documents (both internal and external, stored in structured and unstructured repositories); GIS map data; and geo-referenced image mosaics. The initial stage involved validating a technology for spatial searches to enable streamlined, intelligent access to a collection of scanned documents by secured users, through scheduled automated crawls for geo-tagging, following corporate security guidelines. This stage also included administrator training: defining procedures for managing the document collections, maintaining the hardware appliance used for generating the spatial index, customizing the user search interface, and developing and implementing support roles and responsibilities. Functionality testing was run on a subset of documents representative of the enterprise collections that would need to be addressed by the Exploration Data Access (EDA) solution.

The next stages will focus on broadening the user base, with a goal of having access and use by all corporate geoscientists. This will be accomplished by defining, prioritizing and publicizing the spatial indexing of additional document collections, developing a methodology for managing and enhancing a custom gazetteer with geographic place names specific to the Australian mineral industry, and integrating with existing GIS map layers such as land rights. A proof of concept user interface will be rolled out to a selected User Reference group for input and feedback. The ongoing stages will be supported by utilizing a recently delivered testing and development hardware appliance, implementing connectors to existing electronic document management systems (EDMS) as well as portal delivery systems and SQL data stores, and a complete, feature-rich enhancement of the user interface. This stage will also align the Spatial Discovery project with the larger Exploration Data Access (EDA) initiative and provide a proof of concept for enterprise search strategies based on best practices from other resource industries.

An essential part of this stage is the creation of a Customized Gazetteer to work with the NLP engine and geo-tagging software, which identifies geographic place names in text from multiple formats of unstructured documents and categorizes the index by location types such as country, region, populated place, mines, Unique Well Identifiers (UWI), camps, or concession identifiers. The index also allows sorting of search results by relevance based on natural language context, and Geo-Confidence, or the relative certainty that a text string represents a particular place on a map.
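The gazetteer-driven tagging described above can be sketched in a few lines. This is a minimal illustration only, with hypothetical place names, coordinates and confidence values standing in for the real Customized Gazetteer and NLP engine; the `geo_tag` function and its simple substring matching are assumptions for the sketch, not the production algorithm.

```python
# Hypothetical gazetteer entries: place name -> (lat, lon, feature_type,
# base Geo-Confidence). Values are illustrative, not authoritative.
GAZETTEER = {
    "kalgoorlie": (-30.75, 121.47, "populated place", 0.90),
    "super pit": (-30.78, 121.50, "mine", 0.85),
    "darwin": (-12.46, 130.84, "populated place", 0.95),
}

def geo_tag(text):
    """Return (name, lat, lon, feature_type, confidence) tuples for each
    gazetteer entry found in the text (case-insensitive substring match)."""
    hits = []
    lowered = text.lower()
    for name, (lat, lon, ftype, conf) in GAZETTEER.items():
        if name in lowered:
            hits.append((name, lat, lon, ftype, conf))
    # Sort by Geo-Confidence so the most certain locations rank first
    return sorted(hits, key=lambda h: h[4], reverse=True)

tags = geo_tag("Drill results from the Super Pit near Kalgoorlie")
```

A real engine would add natural-language context to weight each hit, but the shape of the output — ranked places with feature types and confidences — is what the index sorting described above consumes.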

Future improvements to the system will include increasing the confidence in geo-tagging to correctly identify ambiguous text strings such as “WA” in locations and street addresses from context. This will correctly give documents referencing Asia Pacific regions a higher probability of “WA” referring to “Western Australia” instead of the default assignment to “Washington”, the state in the United States. The natural language processing engine can be trained using a GeoData Model (GDM) to understand such distinctions from the context of the document, and can utilize international naming standards such as the ISO 3166-2 list of postal abbreviations for political subdivisions such as states, provinces, and territories. The capabilities of the natural language processing engine to use grammatical and proximity context become more important for the correct map location of documents when a populated place such as “Belmont, WA” exists frequently in company documents because of the location of a data center in Western Australia, for example, but could be confused with the city of Belmont, Washington, in the United States without contextual clues.
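The "WA" disambiguation above can be illustrated with a crude keyword-vote heuristic. This is a sketch under stated assumptions: the context word sets, the `resolve_wa` function, and the voting rule are all hypothetical stand-ins for the trained GeoData Model, which uses far richer grammatical and proximity context.

```python
# Hypothetical context vocabularies; the real GDM is trained, not hand-listed.
AU_CONTEXT = {"perth", "pilbara", "goldfields", "kalgoorlie", "australia"}
US_CONTEXT = {"seattle", "tacoma", "spokane", "united", "states"}

def resolve_wa(document_words):
    """Resolve the ambiguous abbreviation 'WA' by counting context words,
    falling back to the default assignment ('Washington') when context
    is silent -- mirroring the behavior described in the text."""
    words = {w.lower().strip(".,") for w in document_words}
    au_votes = len(words & AU_CONTEXT)
    us_votes = len(words & US_CONTEXT)
    if au_votes > us_votes:
        return "Western Australia"
    return "Washington"  # default assignment, per the text above

place = resolve_wa("Report from the Perth office in Belmont , WA".split())
```

Here a single mention of "Perth" is enough to flip "Belmont, WA" from the Washington default to Western Australia, which is exactly the behavior the trained engine is meant to generalize.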

The NLP engine is made more robust by an understanding of relative text strings such as “30 km NW of Darwin” and support for foreign language grammar and special characters such as those in French and Spanish. The current NLP engine also has the ability to locate and index date text strings in documents so that documents can be located temporally as well as spatially. Next stages of the deployment will include improvements to the current Basic User Interface such as automatic refresh of map views and document counts based on selection option context, support for the creation of “electronic data room” collections in EDMS deployments, URL mapping at directory levels above a selected document, and the capture of backup configurations to preserve snapshots of the index for version control of dynamic document collections such as websites and news feeds. The proof of concept User Interface already includes some innovative uses of user interface controls, such as user-selectable opacities for map layers, the ability to “lock” map refreshes during repeated pans, and utilities for determining geoid centers of polygonal features. Further results of the pilot show that there is the potential to replace the connectors currently in use, enabling an enterprise keyword search engine (EKSE) to perform internal content crawls, ingest additional document types, and pass managed properties to the geo-tagger to enhance the search experience. The performance of remote crawling versus having search appliances physically located in data centers is also being evaluated against the constraints of limiting the content crawled from individual documents.
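Resolving a relative string like “30 km NW of Darwin” reduces to anchoring on a known place and applying a distance-and-bearing offset. The sketch below is an assumption-laden illustration: the `ANCHORS` coordinates, the regular expression, and the flat-earth approximation (111 km per degree of latitude, scaled by cos(latitude) for longitude) are all simplifications of whatever the production engine actually does.

```python
import math
import re

# Assumed anchor coordinates for the sketch (approximate, Darwin, NT).
ANCHORS = {"darwin": (-12.46, 130.84)}
BEARINGS = {"N": 0, "NE": 45, "E": 90, "SE": 135,
            "S": 180, "SW": 225, "W": 270, "NW": 315}

def resolve_relative(phrase):
    """Turn '30 km NW of Darwin' into an approximate (lat, lon) pair,
    or None if the phrase does not match the pattern."""
    m = re.match(r"(\d+(?:\.\d+)?)\s*km\s+(N|NE|E|SE|S|SW|W|NW)\s+of\s+(\w+)",
                 phrase, re.IGNORECASE)
    if not m:
        return None
    dist_km = float(m.group(1))
    bearing = m.group(2).upper()
    lat0, lon0 = ANCHORS[m.group(3).lower()]
    theta = math.radians(BEARINGS[bearing])
    # Flat-earth offsets: ~111 km per degree latitude; longitude degrees
    # shrink by cos(latitude) away from the equator.
    dlat = dist_km * math.cos(theta) / 111.0
    dlon = dist_km * math.sin(theta) / (111.0 * math.cos(math.radians(lat0)))
    return (lat0 + dlat, lon0 + dlon)

pt = resolve_relative("30 km NW of Darwin")
```

For “NW” the offset moves north (latitude increases) and west (longitude decreases), which is a quick sanity check on the trigonometry.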
The pilot project is designed to validate the ability of the geo-tagging tool to share an index with enterprise keyword search engines, and to use Application Programming Interfaces (API's) to provide the results of document ingestion and SQL-based structured data searches to both portal delivery systems and map-based “mash-ups” of search results.

The goals of the successful proof of concept stage were: to demonstrate that the geo-tagger could ingest text provided by the keyword search ingestion pipe, without having to duplicate the crawl of source documents; to use metadata from keyword search for document categorization such as product type, related people, or related companies; and to provide a metadata list of place names, confidences and feature types back to the search engine. The resulting demonstrated functionality moves toward providing “Enterprise Search with Maps”. The completed EDA project is sponsored by the head of exploration and will remove the current “prejudice of place” from global search results for approximately 250 geotechnical personnel working with legacy data and information, in some cases dating back to 1960. The solution supports a corporate shift in focus from regional activity centered on projects and prospects with a 24- to 36-month timeline to global access that will no longer be biased toward locations with first-world infrastructure, and will eliminate the need for exploration personnel to take physical copies of large datasets into areas with high geopolitical risk. The corporate Infrastructure Services and Technology (IS&T) group is the main solution provider in the project, with ongoing responsibility for capacity, networking and security standards management. The deployed solution will have to support search across global to prospect scales, and roles including senior management, geoscience, administrative, data and information managers, research and business development. The focus is on a single window for data discovery that is fast and consistent, with components and roles for connected search and discovery solutions. The entire solution will be compatible with the architecture used for the broader context of a discovery user interface and data layer for mineral exploration.

Further work identified during the Proof of Concept included developing strategies for documents already ingested prior to establishing the keyword search pipe, merging licensing models for the keyword and spatial search engines, and adding full Boolean search capability to the spatial keyword functions. In the current implementation, the user is supplied with a larger search result from the keyword search, while the spatial search returns only those documents with spatial content that allows them to be placed on a map. Conversely, the keyword results will receive place name metadata for searching, but will be limited in map capabilities. Identified benefits from the Proof of Concept were that separate collections of documents did not need to be built for the spatial search engine, the single crawler reduced load on the file repository, and additional connector framework development was not required. The next stage will validate a security model managing document security tokens in the ingestion pipe.
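The keyword-versus-spatial asymmetry described above amounts to a superset/subset relationship between two result sets. The toy data below (document IDs and coordinates) is invented purely to show the shape of that reconciliation, not to reflect any real index.

```python
# Hypothetical result sets: keyword hits are the full superset; spatial
# hits are the mappable subset carrying coordinates.
keyword_hits = {"doc1", "doc2", "doc3", "doc4"}
spatial_hits = {"doc2": (-31.95, 115.86), "doc4": (-20.30, 118.60)}

# Every spatial hit is also a keyword hit; the remainder are keyword-only
# results that receive place-name metadata but no map pin.
mappable = {d: spatial_hits[d] for d in keyword_hits if d in spatial_hits}
keyword_only = keyword_hits - set(spatial_hits)
```

In the user interface this split is what drives the difference between the map view (only `mappable`) and the full result list (`mappable` plus `keyword_only`).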

The baseline architecture was also validated during the Proof of Concept phase. In this architecture, the enterprise keyword search engine (EKSE) passes text from crawled documents individually to the enterprise spatial search engine (ESSE). The ESSE then extracts metadata and processes text using the Natural Language Processing (NLP) engine, looking for geographic references. The ESSE passes back managed properties for locations, rating of confidence in location, and feature type (such as mining area, populated place, or hydrographic feature), while the GeoData Model (GDM) and Custom Gazetteer provide a database of place names, coordinates and features. The system is combined with an existing ESSE component licensed on production for 1 million geo documents, to be used with the geo-tagger stream processor. Geo-confidence results are being analyzed to evaluate the impact of misread characters from digital copies of documents produced through optical character recognition (OCR), and ambiguous character strings such as “tx” being an abbreviation for “transmission” in field notes for electromagnetic surveys as well as a potential spatial location (U.S. Postal abbreviation for Texas).
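The EKSE-to-ESSE hand-off can be sketched as a small pipeline. Everything here is a hypothetical stand-in: the function names `ekse_crawl` and `esse_geo_tag`, the document IDs, and the stubbed tagging logic are assumptions chosen to show the flow of text one way and managed properties the other, not the real product APIs.

```python
def ekse_crawl(documents):
    """Keyword engine side: yield raw text one document at a time,
    as the baseline architecture passes documents individually."""
    for doc_id, text in documents.items():
        yield doc_id, text

def esse_geo_tag(text):
    """Spatial engine side: return managed properties for the document.
    Stubbed here; the real NLP engine extracts these from context."""
    props = []
    if "pilbara" in text.lower():
        props.append({"location": "Pilbara",
                      "geo_confidence": 0.9,
                      "feature_type": "mining area"})
    return props

# Single crawl feeds both engines: managed properties flow back into
# the shared index keyed by document ID.
docs = {"rpt-001": "Assay results from the Pilbara iron ore project"}
index = {}
for doc_id, text in ekse_crawl(docs):
    index[doc_id] = esse_geo_tag(text)
```

The point of the architecture, reflected in the sketch, is that the source documents are crawled once, with the spatial engine working only on the text stream it is handed.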

Recent technology partnerships have included providers of Web Map Services (WMS) that incorporate the idea of large amounts of static or base layer data (land boundaries, Geo-Referenced images and grids) overlain by dynamic operational data such as geophysical and geochemical interpretations. Other development strategies may include launching search in context from analytic applications, conforming to public OGC standards, using the “shopping cart” concept of commercial GeoPortals, and arranging spatial metadata and taxonomies along the lines of ISO content categories.

The pilot project team identified several achievements from the Proof of Concept phase. Documents ingested by the keyword search engine that had place name references were successfully located on the user map view. Categories passed from the keyword search, such as source or company names, were able to be searched in the spatial search engine as document metadata. Also, feature types and place names with location confidences were provided, appearing on the spatial search page as managed properties. The system will be enhanced in the deployment phase, with security implemented by passing access control lists associated with each document through the ingestion pipeline and processing for replicated security in the spatial search engine. Improved presentation of returned managed properties will allow them to be managed for use as a refined list. Search categories can be selectable from an enhanced user interface to allow, for example, selection of a product type for search refinement. This will complement the current Boolean search parameters available in the map view.

The enhanced User Interface also presents the density of search results, the directory location of located documents, and the file type of the document. The map view also allows a more Australasia-centric map experience by removing the arbitrary “seam” at the International Date Line (longitude 180 degrees) so the region can be centered on a map.
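Removing the date-line seam is, at bottom, a longitude renormalization around a chosen central meridian. The sketch below shows the standard arithmetic; the choice of 135° E as an Australasia-centric meridian is an assumption for illustration, not a value taken from the project.

```python
def recenter(lon, central_meridian=135.0):
    """Map a longitude into the continuous range
    (central_meridian - 180, central_meridian + 180], so features on
    either side of the 180-degree line plot without a seam."""
    shifted = (lon - central_meridian + 180.0) % 360.0 - 180.0
    return shifted + central_meridian
```

With the meridian at 135° E, a longitude of -170° (just east of the date line) becomes 190°, continuous with its neighbors in the 170s, while longitudes already near Australia pass through unchanged.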

The concept of “Enterprise Search with Maps” will be driven as part of the architecture of the Exploration Data Access project, and the level of integration may be affected by decisions about future versions of the corporate portal. Next steps include evaluating the relative costs and benefits of the enterprise licenses and how they are consumed and checked out during the crawl and display processes, the potential use of licenses for each active geo-tagged document versus the use of managed properties, direct indexing of spatial databases and geotechnical repositories using the keyword search engine, and security implementation. A third-party application is also being used to scan and categorize documents discovered with geo-tagging in order to extract and protect potentially sensitive information.

The finalized solution will provide a holistic search interface that allows geotechnical users to answer essential questions about both the structured and unstructured data in their enterprise, improving efficient access to mission critical data and reducing the risk of geotechnical decisions.

Thursday, January 19, 2012

The science of shale drilling

The Houston Geological Society, which bills itself on its website as “the world’s largest local geological society” (and if Houston continues to hold its place among the five Texas cities in the top ten on the Men’s Health annual list of “America’s Fattest Cities”, this may be literally true), held an Environmental and Engineering Dinner meeting on 14-Dec to discuss the technology behind risk mitigation for shale gas development. Anthony Gorody, president of Universal Geoscience Consulting, Inc., has been very involved in baseline groundwater sampling and forensic analysis of stray gas in shallow water wells. This was especially timely given the recent identification in Pavillion, Wyoming, of compounds in two deep monitoring wells that the EPA described in its press release as “consistent with gas production” (see the EPA’s draft report). Mr. Gorody repeated the industry’s stance that relatively few water wells are impacted by drilling operations and that to date there have been no unambiguously documented cases of groundwater contamination directly attributed to hydraulic fracturing itself. Instead, he showed that many issues have been with stray gas being released from insufficient cement jobs rather than completion operations (note that even the EPA report references “gas production” first rather than hydraulic fracturing). Compounding the issue is the fact that drilling operations have now moved from areas like Wyoming, where a one-mile radius of investigation around a gas well might encounter only two water wells, to populated lands in Pennsylvania, where the same footprint might include up to 60 privately drilled water wells.
Gorody noted that hydraulic fracturing is not a new technology (it has been in use since 1947) and showed what may be the best image in the public domain to explain the relationship between the depth of water wells and the depth and extent of hydraulic fracturing, which should allow the general public to understand the physical inability of artificial fractures to propagate to groundwater levels in a play like the Barnett.
He used some basic physics to show that small shale pores are two orders of magnitude smaller than the molecular sizes of larger hydrocarbons, and that fractures must be vertically confined in order to create strain release, and slammed the recent EPA report for confusing “coincidence and collocation with cause and effect”. He gave a rule of thumb that the energy released in a typical fracture job is about the same as dropping a gallon of water from head height, while the audience tried to imagine that amount of energy fracturing over 3000 feet of consolidated rock. The high visibility and impact of shale gas drilling operations, however, with 3-5 million gallons of water being trucked in per well, and highly mobile rigs moving through what used to be rural countryside, have led many community organizations to cast a skeptical eye on the industry. Gorody emphasized the impact on instrumentation and monitoring technology, showing that pressure plots and noise surveys do not show any evidence of fluid or gas release, that increased sampling and analysis of gas shows is providing a fossilized history of hydrocarbon expulsion in many basins, and that baseline water sampling is providing a boon in data density and data mining for forensic geochemists studying aquifers, paid for by risk-averse and litigation-wary operators. The voluntary and regulatory release of chemical information supports studies that show produced gases are not isotopically the same as gases found in water wells, and that buoyant hydrocarbons from depth escaping from failed casings, uncemented annuli, and compromised casing cement bonds can invade shallow aquifers and re-suspend colloidal complexes and sediments that have normally settled to the bottom of water wells. This creates a reducing environment in the well pump intake port and drives the bacterial production of toxic sulfides that are then reported as odiferous and noxious.
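The rule of thumb is easy to check. The sketch below computes the potential energy of a gallon of water dropped from head height; the mass and height values are my own assumptions (1 US gallon of water, roughly 1.7 m), not figures from the talk.

```python
# Potential energy E = m * g * h for Gorody's gallon-from-head-height
# comparison. Assumed values, chosen for the sketch:
mass_kg = 3.785    # ~1 US gallon of water
height_m = 1.7     # roughly head height
g = 9.81           # gravitational acceleration, m/s^2

energy_j = mass_kg * g * height_m  # a few tens of joules
```

The result lands in the low tens of joules, which is what makes the comparison so striking against the mental image of fracturing thousands of feet of rock.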
It only takes 87 psi of stray gas to overcome hydraulic head and invade water wells drilled to 200 feet; that is less than the pressure in a standard bicycle tire. Since around 85% of the population can detect sulfides such as H2S at levels of 0.03 ppm, the population quickly becomes aware of the degradation of their water supply, but Gorody noted that this process is completely independent of the hydraulic fracturing used to complete the well, and that maybe the term “hydraulic fracturing” should not even be used when describing vertical wells. Gorody pointed out that as soon as the annulus is squeezed, the problems with water contamination that occur in the first weeks to months after drilling go away, and that nearby monitoring wells may fail to intersect the tortuous paths used by the stray gas to migrate between the gas producing and water wells. His suggestions for reducing these impacts include monitoring of mud logs for gas shows during drilling, cementing off shallow gas shows to prevent leakage, sampling gas for its isotopic fingerprint during drilling to differentiate it from produced gas, running cement bond logs, and coproducing or venting casinghead gas. In the end, it was very enlightening to hear such a sober, scientific evaluation of the technology being used to track gas in shale plays, as opposed to the usual dialogue in places like mainstream media.
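The 87 psi figure follows directly from the hydrostatic head of a 200-foot column of fresh water, using the standard freshwater gradient of about 0.433 psi per foot. A one-line check:

```python
# Hydrostatic head of a 200-ft freshwater column, using the standard
# freshwater pressure gradient of ~0.433 psi/ft.
FRESHWATER_GRADIENT_PSI_PER_FT = 0.433
well_depth_ft = 200.0

head_psi = FRESHWATER_GRADIENT_PSI_PER_FT * well_depth_ft  # ~87 psi
```

So any stray gas arriving at the wellbore above roughly 87 psi can displace the water column, matching the figure quoted in the talk.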

Wednesday, December 21, 2011

Chevron at the SPE Digital Energy Study Group

Jim Crompton, Senior IT Advisor for Chevron, addressed the SPE Digital Energy Study Group in Houston on 16-November on the topic of the “Digital Oil Field IT Stack”. He announced that he wanted to address, and be a bit provocative about, what he described as two recognized barriers that came out of the panel discussions at the SPE ATCE in Denver a couple of weeks before. His experience comes from implementing what he described as “gifts” from the Chevron central organization in diverse business units for real-world application. He felt the two unaddressed barriers were change management, and the need for a standard infrastructure and architecture, which he proposed to describe. His presentation started with some standard, but according to him, neglected trends in the expanding scope and role of IT, including increased digitization, a move into plants and fields, and the need to address the latest generation of IT consumers, which he described as the first generation of oilfield workers to have better IT infrastructure in their homes than at work. He acknowledged that in many cases, the “first kilometer” is still a problem, as when an entire offshore field may be instrumented with fibre optics, but the link to the onshore office is still via low-bandwidth microwave links (he only half-jokingly suggested lack of telecom coverage as a positively correlated indicator for oil occurrence). So how do we leverage the hundreds of thousands of sensors on a new greenfield platform and move from a “run to failure” mode to one of proactive failure detection and avoidance? Jim cited some examples of predictive analytics, Statoil’s experiments with injected nano sensors that report back on reservoir conditions, distributed sensors for real-time optimization, and new mobility platforms for field workers.
But the most interesting new idea was that of borrowing sensor mesh architectures from agricultural and military applications to go beyond current de-bottlenecking workflows and address the advanced analytics used by electrical engineers in their instrumentation. He indicated such a robust and cheap architecture “pattern” might be one of maybe half a dozen that an IT group like Chevron’s might use to provide semi-customizable solutions. Part of the frustration he acknowledged was that, at least at Chevron, his best Visual Basic programmers are petroleum engineers using Excel, and they are more in touch with Microsoft development plans than his IT group, and upset that the next version of Excel will remove Visual Basic and move it to the SharePoint platform. Faced with Chevron now having over 20 million gigabytes of digital data under management, he suggested treating the information pipeline in the same way we manage hydrocarbon pipelines, and trying to prevent “leaks” to unmanaged environments, like Excel. He showed some digital dashboards that could provide a balance between real-time surveillance and advanced modeling, mix the needs of mapping and reporting services, and move organizations up the Business Intelligence maturity model. He finished with a quick nod to Hadoop solutions and a need to move away from “creative solutions that only solve when the creator is present”.

Saturday, February 19, 2011

A Once in a Thousand Year Event?

Some new work along the southeastern tip of India shows that the Boxing Day Tsunami was rare, but not unprecedented. Now that scientists know what the erosional remnants of a global tsunami event look like when preserved on the beaches of that coast, they can use Ground Penetrating Radar (GPR) and sediment cores to look for evidence of other events correlated around the Indian Ocean basin, and use optical methods to come up with dates for them. The latest round of work has identified two previous tsunami records at 1080 years ago (+/- 60 years) and 3710 years ago (+/- 200 years). So yes, those of us who witnessed this event were indeed present for one that starts to bridge the gap between human history and the geologic record.
See: (EOS, Transactions, American Geophysical Union, Vol. 91, No. 50, 14-Dec-2010, "Subsurface Images Shed Light on Past Tsunamis in India", Rajesh Nair, Dept. of Ocean Engineering, Indian Institute of Technology)

In the News ... Again

In following up the scientific response to the BP Gulf of Mexico Oil Spill (yes, the media tagged it, and it will never be the "Cameron BOP spill" or the "Anadarko Joint Venture spill"), there is some real insight from those who deal every day with complex technological ventures. In a pretty good indication that, yes, scientists are the pragmatic lot that we expect and need them to be, I have now come across at least two admissions in technical and scientific publications that when it comes to huge expensive undertakings like deep offshore drilling, the next spill is not a matter of if, but when.

When the National Oceanic and Atmospheric Administration (NOAA) stood up their GeoPlatform website in response to the spill, their CIO was quite candid in noting that they were already planning how the IT infrastructure would have to evolve in order to meet the "next crisis". See:

And Case Western Reserve University has received a grant from the National Science Foundation to study an aerogel material that can soak up eight times its weight in oil, and then be wrung out and re-used. The goal is to lower the cost of the gel so it can be used "during the next big spill".

Those who launch people into space, build high energy physics labs, or even integrate complex software suites, and do it under budgetary constraints, live with a harsh reality. The technicians who are even today, as Paul Carter describes in "This is Not a Drill", designing the "whole fleets of brand new sixth generation, fly by wire cyber rigs ... getting spat out of shipyards all over the world at the moment" ... they know it.

When you push the technology to its limits, sooner or later, something will go wrong.

Thursday, February 17, 2011

How did this trajectory start?

In the book that will eventually trace the course of this particular scientist through the global oilfields, speculatively titled "Hold My Beer and Watch This!", I will undoubtedly have to spend some time explaining how a short intellectual kid from Chicago ended up driving a 27-ton Litton Vibrator Truck in Pecos, Texas. In his book "This is Not a Drill: Just Another Glorious Day in the Oilfield", Paul Carter describes some of the motives that led him to join offshore rig crews; namely wanderlust, camaraderie, and lucrative contracts. Interestingly, these were among the same things listed by Frank "The Irishman" Sheeran in the book "I Heard You Paint Houses" as reasons for him joining the Mob....
In my case it was not only the prospect of a lucrative job actually using my college degree when the mining business was collapsing around Upper Michigan in the early 1980s, the possibility to work in remote exotic locations (ok, but Pecos?), and knowing I would be working with geoscientists who I already knew to be a friendly and jovial lot, but the fact that at that time, oil companies were actually using some of the spiffiest technological equipment of the times. I mean, we had access to computers!
I could actually submit a seismic processing job from a teletype terminal in Midland, Texas, and have it checked and submitted by a computer operator in The Woodlands outside of Houston the same day. I knew I had picked the right industry when, in the mid-1980s, the U.S. government decided they could help fund the big government labs by finding commercial applications for some of the technology. When Los Alamos in New Mexico went looking for industry customers, one of the first segments they turned to was "Big Oil". I found myself on a trip from Dallas, Texas, to Albuquerque, New Mexico, with a delegation of oil and gas technologists to get a first look at what the weapons guys had been doing inside the top secret walls that housed the Manhattan Project in its day. We didn't get "inside the wall" where they do the real crazy stuff, and our unfortunately Iranian-born Vice President didn't even get that far; his clearance was denied at the gate, and he spent the day at the hotel and looking at "Fat Man" and "Little Boy" in the museum. But the conversations we had around the conference table that day were pretty interesting.
"Oh so you want a way to reduce engine noise on a ship so you can listen better to sonic waves? ... yeah we can do that"
"Oh so you would like to be able to run huge 3D process simulation using parallel processing and hierarchical storage of modeling data? ... yeah we can do that"
And when the previously cloistered government scientists from the weapons lab met the oilfield completion engineers working on downhole perforation guns for deep drilling, it got really interesting:
"Oh it would be good if you could direct a shaped explosive charge to blow a precisely oriented hole through thick steel casing from a few miles away? Hell Son, we do that every god-damned day around here! Wanna come out to the range and see it?"
Later in the day I got to walk through what was then one of the largest computers on the planet, the Thinking Machines CM-2 massively parallel hypercube array, and when I say walk through, that's exactly what I mean. You didn't stand and look at this computer, you walked into it! I knew it was big when I saw them wheel in, on a cart, a standard workstation like the ones we were using at the time to run our 3D visualizations, and start to use it to run a backup of just part of the array.
They also had a Cray-2 there, the same model I ran into in the Musée des Arts et Métiers in Paris when my wife and I visited for our 25th anniversary in 2006. I was later to find out it was not only the same model, but in fact the very same machine I had reverently laid my hand on to feel the chilled water cooling system when it was running simulated nuclear explosion models in New Mexico two decades earlier and an ocean away. Now where else but the oilfield could you make a connection like that?