Wednesday, November 13, 2013

Amazon Web Services and NASA team up to provide public processing of Earth Observation data

NASA / NEX Public Data Sets on AWS
I've been talking to Amazon Web Services (AWS) lately about their public datasets program. These datasets are hosted by AWS for free, providing easy access to data and compute without users having to bear all the costs involved in managing some of the larger datasets. The projects I'm involved with all deal with government (public) datasets, particularly geoscience data, and some of those datasets are very large. My team and I have looked at using the commercial clouds for these purposes before, but in some cases the data management issues were prohibitive. However, with AWS hosting the public data in the cloud to start with, the possibility of innovating and providing services on top of that data becomes viable again. It would also allow us to put our research infrastructure in a place readily accessible to industry early adopters.

AuScope Virtual Geophysics Laboratory - geophysics data, tools and cloud computing for researchers
NASA and AWS have been working on this with a bunch of Landsat, MODIS and climate data products. The key, of course, is to have both the data and the tools available - and this means porting some of our services to the AWS way of doing things. At first glance that isn't difficult to do, but is it valuable enough for someone to support the upfront work?

Aside from the satellite data, I think the national geophysics data and the AuScope National Virtual Core Library are good candidates, and the AuScope Virtual Geophysics Laboratory already provides some tools and is available to researchers on the NeCTAR research cloud. Porting this to AWS would give industry and government users access as well. A natural complement to the National Virtual Core Library would be a Drill Hole Virtual Laboratory providing data integration, analysis and domaining for the NVCL holes. You could even combine these and use the drill holes as constraints in geophysical inversion. Buy enough cloud compute and you might just be able to do the entire Australian continent all at once!

Exciting times ahead I'm sure.

Thursday, October 31, 2013

Searching the GA CSW catalogue using a spatio-temporal, protocol and keyword filter

A few posts ago I published some code to access the GA Landsat scene archive via Python - Accessing the Geoscience Australia ARG25 OGC Services from Python.

My tool of choice that day was Python using OWSLib 0.7. Since then OWSLib has undergone some significant development and is now at version 0.8.2. The new library has good support for Filter queries, which means you can now construct a spatio-temporal and attribute query against the CSW service.

This only impacts the code for the first step, so I'll just include an example which searches for "landsat-5" data in the time period between 1/1/2010 and 1/1/2011 touching the Australian Capital Territory bounding box. Don't forget to use your favourite Python package manager to download the new OWSLib from PyPI.

Step 1: Searching the catalogue using a spatio-temporal, protocol and keyword filter
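Here's a minimal sketch of that query using the owslib.fes module in OWSLib 0.8.x. The CSW endpoint URL, the apiso queryable names and the ACT bounding box values are my assumptions - check the queryables your target catalogue actually advertises. In this sketch the protocol filtering (picking out the WCS endpoints) still happens when scanning the returned records, as in the earlier post.

    # A minimal sketch with OWSLib 0.8.x. The endpoint URL, apiso queryable
    # names and bounding box values below are assumptions, not GA's exact ones.
    from owslib.csw import CatalogueServiceWeb
    from owslib import fes

    CSW_URL = 'http://www.ga.gov.au/geonetwork/srv/en/csw'  # placeholder endpoint

    csw = CatalogueServiceWeb(CSW_URL)

    # Keyword filter: match 'landsat-5' anywhere in the record text.
    keyword = fes.PropertyIsLike('csw:AnyText', '%landsat-5%')

    # Temporal filter: acquisitions between 1/1/2010 and 1/1/2011.
    begin = fes.PropertyIsGreaterThanOrEqualTo('apiso:TempExtent_begin', '2010-01-01')
    end = fes.PropertyIsLessThanOrEqualTo('apiso:TempExtent_end', '2011-01-01')

    # Spatial filter: a rough Australian Capital Territory bounding box.
    act = fes.BBox([148.7, -35.9, 149.4, -35.1])

    # AND the clauses together; 'full' records carry the service URIs.
    csw.getrecords2(constraints=[fes.And([keyword, begin, end, act])],
                    esn='full', maxrecords=10)

    for identifier, record in csw.records.items():
        print(record.title)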


Monday, August 12, 2013

AuScope - infrastructure for Solid Earth Informatics

AuScope recently held a symposium showcasing the many components which make up its national research infrastructure. Ranging from data collection campaigns and instrumentation for high resolution spatial positioning through to web services for data access and computational analysis, AuScope has created a wealth of new data, tools and techniques to aid researchers in earth informatics, focused on the structure of the Australian continent. You can find out more about AuScope by checking out the symposium publications and listening to the webcasts.

My involvement with AuScope is in leading the development of the national data access infrastructure, which is powered by OGC Web Services and a package of Free and Open Source Software we refer to as the Spatial Information Services Stack (SISS). SISS is deployed at Australian government geological surveys and universities. This is the same technology that powers the Geoscience Australia Australian Reflectance Grid web services I posted about previously.

You can watch the talk about the AuScope web services infrastructure and SISS here.

Monday, July 22, 2013

WMS for Geoscience Australia ARG25

In my last post I showed how to use Python and OWSLib to access the Geoscience Australia ARG25 web services. WMS is just as easy and here's a code snippet for that.
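A minimal sketch of such a snippet follows, using OWSLib's WMS client. The endpoint URL here is a placeholder - in practice you'd use a per-scene WMS URL harvested from the CSW records, just as the WCS URLs were in the last post.

    # A minimal sketch using OWSLib's WMS client. The endpoint URL is a
    # placeholder; use a WMS URL harvested from the CSW records.
    from owslib.wms import WebMapService

    WMS_URL = 'http://example.ga.gov.au/wms'  # placeholder per-scene endpoint

    wms = WebMapService(WMS_URL, version='1.1.1')

    # Pick the first advertised layer and inspect it.
    layer_name = list(wms.contents)[0]
    layer = wms[layer_name]
    print(layer.title, layer.boundingBoxWGS84)

    # Request a quick-look PNG over the layer's own bounding box.
    img = wms.getmap(layers=[layer_name],
                     srs='EPSG:4326',
                     bbox=layer.boundingBoxWGS84,
                     size=(512, 512),
                     format='image/png')

    with open('quicklook.png', 'wb') as f:
        f.write(img.read())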

Saturday, July 20, 2013

Accessing the Geoscience Australia ARG25 OGC Services from Python

Geoscience Australia (GA) recently released a beta of their Australian Reflectance Grid 25 (ARG25) product using Open Geospatial Consortium (OGC) web services.
The services comprise the following:
  • a Catalogue Service for the Web (CSW), allowing users to easily search the ISO 19115 compliant metadata that the ARG25 data adheres to;
  • Web Map Services (WMS), allowing users to obtain quick-look images of the data to determine whether the full resolution image will meet their requirements; and
  • Web Coverage Services (WCS), allowing users to obtain the full resolution data.
GA also provides a web mapping portal you can use to search and download scenes, but the best way is "my way" - and the web services allow you to do exactly that: search and connect directly to the archive using your own tools.

My tool of choice for today is Python using OWSLib 0.7.

OWSLib provides support for CSW, WMS and WCS. I won't be using WMS today as I want actual data from the WCS, not a pretty picture which is what the WMS provides.

Step 1: Searching the catalogue and extracting the WCS URLs

OWSLib makes this trivial. The GA CSW URL is used to create a CatalogueServiceWeb object; pick your bounding box (in this case the Australian Capital Territory), then get the 'full' metadata records. 'full' records are needed to ensure we get the various web service URIs and not just a 'summary' metadata record. After that there is a bit of searching through the records' URIs to find the WCS endpoints for each scene in the bounding box.
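A minimal sketch of that step follows (OWSLib 0.7). The CSW endpoint URL, the bounding box values and the protocol string matched below are my assumptions; adjust them to what the GA records actually contain.

    # A minimal sketch (OWSLib 0.7). The endpoint URL, bounding box values
    # and the protocol string matched below are assumptions.
    from owslib.csw import CatalogueServiceWeb

    CSW_URL = 'http://www.ga.gov.au/geonetwork/srv/en/csw'  # placeholder endpoint

    csw = CatalogueServiceWeb(CSW_URL)

    # A rough Australian Capital Territory bounding box (minx, miny, maxx, maxy).
    act = [148.7, -35.9, 149.4, -35.1]

    # 'full' records are needed so the response carries the service URIs.
    csw.getrecords(bbox=act, esn='full', maxrecords=5)

    # Scan each record's URIs for WCS endpoints.
    wcs_urls = []
    for record in csw.records.values():
        for uri in record.uris:
            if uri.get('protocol') and 'WCS' in uri['protocol']:
                wcs_urls.append(uri['url'])

    print(wcs_urls)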

Note: In this example I've limited the number of returned records to 5 - there are something like 130,000 of them in the full catalogue. The metadata records also contain time bounds for the acquisition time of the imagery. OWSLib 0.7-dev has preliminary support for Filter construction, including time spans. I've used the latest stable release of OWSLib for my example and haven't constructed a time filter. I may try it with 0.7-dev later - if you should happen to try it, let me know and I'll add the extra example.


Step 2: Grab the data from the WCS

Each coverage service contains multiple bands (e.g. 'Band1', 'Band7'). You can also obtain related information, such as the file formats the WCS can return and the bounding box for the scene.
The code snippet below loops through the WCS URLs obtained in Step 1, prints out some of this information, and then gets the scene data for Band7 in GeoTiff format, saving it to a sequence of files as it goes.
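A minimal sketch of that loop (OWSLib 0.7). The output size and the 'GeoTiff' format string are assumptions - check each coverage's supportedFormats for what the service actually offers.

    # A minimal sketch (OWSLib 0.7). The output size and 'GeoTiff' format
    # string are assumptions; check supportedFormats on a real service.
    from owslib.wcs import WebCoverageService

    for i, url in enumerate(wcs_urls):  # wcs_urls collected in Step 1
        wcs = WebCoverageService(url, version='1.0.0')

        # Each per-scene service exposes the bands as coverages.
        band = wcs['Band7']
        print(band.supportedFormats, band.boundingBoxWGS84)

        # Fetch Band7 over the scene's own bounding box and save it.
        response = wcs.getCoverage(identifier='Band7',
                                   bbox=band.boundingBoxWGS84,
                                   crs='EPSG:4326',
                                   format='GeoTiff',
                                   width=512, height=512)

        with open('foo%d.tif' % i, 'wb') as f:
            f.write(response.read())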



The GeoTIFF files (foo0.tif, foo1.tif...) can be opened using your favourite viewer. Here's the result for foo2.tif:


If you're not familiar with Landsat you may think foo0 and foo3 are failing, as there is a striping effect. That is the real data, not a problem with the code or services - I'll leave it to the reader to go and find out what happened to cause "Landsat 7 striping".

That's all there is to it, except error handling, unit tests and tidier Python code (maybe this blog should be called "rusty middle-aged coder").