Toward a Service Oriented Architecture for VENUS and NEPTUNE
CANARIE Project CIIP-19
About CANARIE
CANARIE, Canada's advanced network development organization, is a
not-for-profit corporation supported by its members, project partners and the Federal
Government. Since 1993, CANARIE has received more than $350 million from
the Government of Canada. That funding has been used for the research and
implementation of advanced networks and applications that stimulate economic
growth and increase Canada's international competitiveness. CA*net 4 is the
fourth generation of Canada's research and education network. CANARIE's
mission is to accelerate Canada's advanced network development and use by
facilitating the widespread adoption of faster, more efficient networks and by
enabling the next generation of advanced products, applications and services to
run on them.
Project Goals
The goals of this project are to provide the VENUS and NEPTUNE Canada
Cabled Ocean Observatories with integrated management of scientific
instruments, the capability to deliver event information to users, and
integrated access to distributed compute and data resources through the use
of innovative technologies. The wealth of new and old data will be easily
exploitable through workflow orchestration tools, existing grid processing
infrastructures (including the high-speed networks interconnecting them) and
the underlying web services technologies. The expected results include a
rapid enrichment of the data archive, turning raw data into directly
exploitable information much faster and in a more elegant way. The project
is innovative in that it provides an integrated approach to dealing with
many different data sources (sensors and databases) while at the same time
defining an approach to accelerate the extraction of knowledge from the data
collected. Given the aggressive schedule of the project, Canada will rapidly
benefit from the new architecture and management technologies that will be
developed. Moreover, the country's profile abroad will be raised
significantly, as other organizations world-wide are contemplating the
construction of similar ocean observatories and are watching our progress
very closely.
For more information about the project, please see the Statement of Work.
Project Description
Along with the instrumentation and the “wet plant”, the data management and
archiving system (DMAS) is an integral part of the VENUS and NEPTUNE
projects. It will be in charge of round-the-clock data acquisition from a
whole array of instruments and sensors, of long-term data storage and
retrieval, and of network resource management.
The varied nature and the large number of fixed and mobile sensors that
will be deployed, as well as the frequency with which instruments will be
added or moved, call for the implementation of a system that will respond
dynamically and autonomously to configuration changes.
Moreover, the vast amount of data produced (petabytes will be available in
the archive after a few years of operation) signals the need for powerful,
efficient and intelligent data processing and analysis systems. These
systems will mine the live as well as the archived data streams to detect
trends, classify content and extract features, feeding the results back into
the master database and thereby turning raw data into information. Finally,
the scientists will transform that information into knowledge. Both of the
above aspects are challenging, but approaches based on recent
cyber-infrastructure concepts will allow the use of innovative solutions to
the problems at hand.
The proponents of this project believe that the deployment of a
Service-Oriented Architecture relying on Web Services will provide an
elegant solution to the first problem above, whereas workflow orchestration
techniques will be instrumental in helping scientists assemble complex
processing chains to be executed on a ubiquitous grid infrastructure. While
it does not solve the actual scientific problems related to the discovery of
data features, this project aims to empower NEPTUNE and VENUS users to
conveniently weave their algorithms and data sets into a data and processing
fabric.
The data volumes from some underwater instruments, in particular if several
days' or weeks' worth are requested at once, will be such that moving them
between grid nodes will require fast computer networks, possibly through the
use of User Controlled LightPath (UCLP) techniques, as those can dynamically
make bandwidth available on demand.
The present project produces results in three key areas:
- Instruments are integrated into the overall observatory
cyber-infrastructure through web services that implement the
communication with them
- Really Simple Syndication (RSS) feed technologies are deployed
to deliver event information to subscribers (scientists or processes)
- Workflow orchestration tools are made available to facilitate
the development of user-driven, complex scientific analysis
applications to be executed on grid compute resources
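As an illustration of the second area, the sketch below shows how event
information could be published as an RSS 2.0 feed and read back by a
subscribing process. It is not the project's implementation: it uses only
the Python standard library, and the function names, the feed title, the
example.org URL and the event fields are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_event_feed(title, link, description, events):
    """Return an RSS 2.0 document (as a string) with one <item> per event.

    Each event is a dict with the hypothetical keys
    'id', 'title', 'description' and 'time' (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for event in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = event["title"]
        ET.SubElement(item, "description").text = event["description"]
        # pubDate uses the RFC 822 style date format expected by RSS readers.
        ET.SubElement(item, "pubDate").text = format_datetime(event["time"])
        # The guid lets a subscriber recognise events it has already seen.
        ET.SubElement(item, "guid", isPermaLink="false").text = event["id"]
    return ET.tostring(rss, encoding="unicode")


def read_event_titles(feed_xml):
    """Subscriber side: return the titles of all items in the feed."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("title") for item in root.iter("item")]


# Hypothetical event, for illustration only.
sample_events = [
    {
        "id": "event-0001",
        "title": "Instrument back online",
        "description": "A hypothetical instrument resumed data acquisition.",
        "time": datetime(2007, 1, 15, 8, 30, tzinfo=timezone.utc),
    }
]
feed = build_event_feed(
    "Observatory events (example)",
    "http://example.org/events",
    "Hypothetical event feed",
    sample_events,
)
print(read_event_titles(feed))
```

A subscriber, whether a scientist's feed reader or an automated process,
would poll such a feed periodically and use the guid values to skip events
it has already handled.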
The objectives of the project have been evaluated at multiple
stages. Firstly, scientific oversight of the project has been
guaranteed through the participation of a NEPTUNE and/or VENUS
scientist in the test preparation and execution, the project
communication and the preparation of use cases. Secondly, an external
peer review of both the architectural and system designs has taken
place through a Preliminary Design Review (PDR), and a Critical
Design Review is planned for mid-March 2007.
NEPTUNE DMAS CANARIE Proposal
This section provides information and documents related to the NEPTUNE DMAS
CIIP-19 proposal, which was implemented between September 2005 and December
2006. In particular, the Statement of Work (PDF format) describes the
planned work.
Progress Reports
CANARIE's Funding Contribution
The original contract between CANARIE and UVic on the CIIP-19 proposal
allowed us to spend up to $1.1M on this project. This money was to be
used mostly on personnel, consultancy, software and hardware. CANARIE
was to contribute 75% of this amount.
The project has since been completed well within budget. The
management plan was respected and now calls for a continuation of the
project efforts with existing resources. The products will be
disseminated through our web site, following the open source approach
initially promised for this software. NEPTUNE Canada is
fully funded out to the end of 2008 to develop its entire
infrastructure, including software. The part devoted to software is
sufficient to complete the objectives within the coming two years.
Additional Documentation
Also relevant to this project are the slide collections presented at a
preparatory workshop. This workshop helped us better define the scope of
the project and allowed us to gather the advice of a number of
specialists. The participants and their contributions are listed below.
Participants and their Contributions
- The University of Victoria
- UVic, which hosts both the VENUS and NEPTUNE projects, will
support the project by offering the use of its premises and
infrastructure. UVic, a leading BC university, has over 18,000
registered students and over 3,000 staff and faculty. UVic commands a
yearly cash flow of over a quarter billion dollars. NEPTUNE Canada is
a consortium of 12 Canadian universities. The UVic team will be
composed of:
- Chris Barnes has been Project Director of NEPTUNE Canada since
2001. He will assist in the oversight, coordination
and applications of this CIIP work with other NEPTUNE
researchers.
- Verena Tunnicliffe has been Project Director of the VENUS project
since 2001. She will assist in the oversight and coordination with
VENUS researchers.
- Benoît Pirenne, Associate Director, IT, for NEPTUNE and
in charge of the DMAS development, will act as overall
coordinator for this project.
- IBM Canada Ltd, Markham, Ontario
- IBM Canada is a key contributor to the Canadian economy through
significant R&D investment, job creation, use of Canadian suppliers and
extensive participation in university research programs. IBM Canada is one
of the country's largest R&D investors, contributing $334 million in 2004.
Its export revenue for the same year was $1.7 billion. At year end 2004,
IBM Canada and its wholly-owned subsidiaries employed some 20,000 regular
full-time and part-time people across the country. In addition, IBM
provided temporary employment for 2,938 people, including 662 students, and
hired 1,274 regular full-time employees. In a recent KPMG / Ipsos-Reid
survey, IBM Canada was ranked among Canada’s Top 25 most respected
companies. IBM Global Services is the largest information technology
services provider in Canada.
For the present project, IBM can contribute a very significant array of
expertise to supplement the NEPTUNE and VENUS teams' knowledge in the
fields of Web Services, Service-Oriented Architecture and related areas,
through its Hursley, UK and Rochester, MN Research Laboratories as well as
its Pacific Development Centre in Vancouver. IBM will apply its cumulative
knowledge in the fields of Earth and Life Sciences, petroleum exploration,
manufacturing, finance and healthcare to further this exploratory effort.
Key personnel: Robert Heuchert will lead the IBM participation and will be
key to allocating IBM's resources to the project and providing advice at
all stages.
- The Laboratory for the Ocean Observatory Knowledge
INtegration Grid (LOOKING project)
- The LOOKING project is a US National Science Foundation-funded
research effort into the identification, synthesis and assemblage
of existing and emerging concepts and technologies into a coherent,
viable cyber-infrastructure design. The goal of this effort is to
federate ocean observatories into an integrated knowledge grid. The
expected contribution of LOOKING is in the area of overall
cyber-infrastructure architectures, coordination with similar
initiatives south of the border, and the evaluation of the project's
progress.
Key personnel: Matthew Arrott
- The Scientific Workflow Automation Technologies Laboratory, San Diego
Supercomputer Center, UCSD
- The pioneering work of this laboratory in the area of workflows and grid
interfaces will be essential to the workflow aspects of this project. In
particular, the efforts with the Kepler toolkit are expected to be of high
relevance for the present enterprise.
Key personnel: Ilkay Altintas
- The Monterey Bay Aquarium Research Institute (MBARI), Monterey, CA
- The work presently carried out at MBARI in the area of ontologies (e.g.,
the Marine Metadata Interoperability (MMI) project) will be instrumental to
the success of this work as regards its interoperability with other
international initiatives, by providing advice on our ontology choices.
Other collaborations with MBARI are expected in the area of streaming data
analysis, which this institute has been pioneering (e.g., the work of Duane
Edgington).
Key personnel: John Graybeal, Luis Bermudez
- The GridX1 consortium
- The GridX1 consortium, a Canadian computational grid, has offered to host
the heavy data analysis applications that we intend to deploy to
demonstrate the instrument and archive link with the grid. All of the
GridX1 resource centres are linked with CANARIE and make use of the Globus
toolkit.
Key personnel: Dr. Randall Sobie, UVic
- ORAN
- The ORAN with which we will be interacting is BCNET. The project has been
discussed with Mike Hrybyk, President. BCNET will assist this project's
team in the preparation of the data transport requirements, both from the
instruments and the related shore stations (Sidney, BC and Port Alberni,
BC) and from the University.