ERDDAP
Easier access to scientific data
Brought to you by NOAA NMFS SWFSC ERD
ERDDAP has a utility to
Convert a Numeric Time to/from a String Time.
For more details, see
How ERDDAP Deals with Time.
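As a rough illustration of what such a conversion involves, here is a minimal Python sketch that converts between a numeric time and a string time. It assumes the numeric time is expressed as "seconds since 1970-01-01T00:00:00Z" and that the string time is ISO 8601 in UTC; both are assumptions made for the example, and the function names are hypothetical (this is not ERDDAP's own code).

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def numeric_to_string(seconds_since_epoch: float) -> str:
    """Convert seconds since 1970-01-01T00:00:00Z (assumed units) to an ISO 8601 string."""
    dt = datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)
    return dt.strftime(ISO_FORMAT)

def string_to_numeric(iso_string: str) -> float:
    """Convert an ISO 8601 UTC string (e.g., 2020-01-01T00:00:00Z) to seconds since the epoch."""
    dt = datetime.strptime(iso_string, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return dt.timestamp()

print(numeric_to_string(1577836800))
print(string_to_numeric("1970-01-02T00:00:00Z"))
```

Other units (e.g., "days since" some other base time) would need a different conversion factor and base time.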
Because the longitude, latitude, altitude, and time variables are specifically recognized, ERDDAP is aware of the geo/temporal features of each dataset. This is useful when making images with maps or time-series, and when saving data in geo-referenced file types (e.g., .esriAscii, .geoJson, and .kml).
Two common standards for writing units of measure are:
Email/URL Subscriptions
(not available at some ERDDAP installations)
Whenever a dataset changes, the email/URL subscription system will immediately
send you an email or contact a URL that you specify.
Email/URL subscriptions are not available at some ERDDAP installations.
To set up an email/URL subscription, click on one of the envelope icons
that appear at the far right on ERDDAP web pages with lists of datasets
(example)
and on the Data Access Forms and Make A Graph web pages for individual datasets
(example)
if this ERDDAP installation supports email/URL subscriptions.
(Computer programmers: if you write web services, you can use the URL system
to have ERDDAP notify your web service immediately whenever a dataset changes.)
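For programmers, the receiving side of a URL subscription might look roughly like the Python sketch below. It is only a sketch under stated assumptions: it assumes the subscribed URL is contacted with a simple HTTP GET request, and the "datasetID" query parameter, the /dataset-changed path, and the port are hypothetical choices that you would build into the URL you give when subscribing.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

changed_datasets = []  # dataset IDs reported so far (kept in memory, for illustration only)

def record_notification(path: str):
    """Read the hypothetical 'datasetID' query parameter from a request path and record it."""
    query = parse_qs(urlparse(path).query)
    dataset_id = query.get("datasetID", [None])[0]
    if dataset_id is not None:
        changed_datasets.append(dataset_id)
    return dataset_id

class NotificationHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumption: the subscribed URL is contacted with a plain GET request.
        dataset_id = record_notification(self.path)
        self.send_response(200 if dataset_id else 400)
        self.end_headers()

def serve(port: int = 8000) -> None:
    """Run the listener until interrupted (call this from your own start-up code)."""
    HTTPServer(("", port), NotificationHandler).serve_forever()
```

With this listener running, the URL to subscribe with would be something like http://your.server:8000/dataset-changed?datasetID=someDatasetID (a made-up address). A real service would react to the notification (e.g., by re-downloading the data) instead of just appending to a list.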
RSS Subscriptions
RSS is a standard system for notifying users when the content at a website has changed.
Modern web browsers have an RSS client built in or you can use a separate
RSS Reader.
ERDDAP offers a separate RSS 2.01 feed for each dataset so that you can find out
when interesting datasets have changed.
To subscribe to a dataset's RSS feed, click on one of the RSS icons
that appear at the far right on ERDDAP web pages with lists of datasets
(example)
or on the Data Access Forms and Make A Graph web pages for individual datasets
(example).
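If you would rather read a feed from a program than from an RSS reader, the Python sketch below shows the general idea: parse the RSS XML and find the date of the most recent item. The sample feed is made up for illustration (a real feed may contain other elements), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# A made-up RSS 2.0 document standing in for a dataset's feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example dataset feed</title>
    <item>
      <title>This dataset changed</title>
      <pubDate>Tue, 02 Jan 2024 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>This dataset changed</title>
      <pubDate>Mon, 01 Jan 2024 08:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def latest_change(feed_xml: str):
    """Return the newest pubDate in the feed as a datetime, or None if there are no dated items."""
    root = ET.fromstring(feed_xml)
    dates = [
        parsedate_to_datetime(item.findtext("pubDate"))
        for item in root.iter("item")
        if item.findtext("pubDate")
    ]
    return max(dates) if dates else None

print(latest_change(SAMPLE_FEED))
```

A client would fetch the feed periodically, call a function like this, and compare the result with the date it saw last time.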
Comparison
The RSS service may be just what you are looking for. It is a nice standard.
But if you need to know as soon as possible when a dataset changes, use the
email/URL system, not RSS. RSS clients request and read the RSS XML document
periodically (often about once an hour) to look for changes.
So an RSS client typically will not detect a change to a dataset quickly
(with hourly polling, the average delay is about 30 minutes).
In contrast, the email/URL subscription system acts immediately whenever ERDDAP detects
a change to a dataset.
The more proactive approach of the email/URL system is also much more efficient:
You may be able to set your RSS client to check for changes every minute (don't do it!),
but that would just lead to lots of unnecessary requests to the ERDDAP server
and it still wouldn't detect changes immediately.
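The trade-off can be put into rough numbers. The small Python sketch below assumes that dataset changes happen at times unrelated to the polling schedule, so that the average detection delay is half of the polling interval; the function names are just for illustration.

```python
def average_delay_minutes(poll_interval_minutes: float) -> float:
    """Average time until a change is noticed, assuming changes occur at random times."""
    return poll_interval_minutes / 2

def requests_per_day(poll_interval_minutes: float) -> float:
    """Number of feed requests one client sends to the server per day."""
    return 24 * 60 / poll_interval_minutes

for interval in (60, 1):
    print(interval, average_delay_minutes(interval), requests_per_day(interval))
```

Under that assumption, hourly polling means 24 requests per day per client and an average delay of 30 minutes. Polling every minute means 1440 requests per day per client, almost all of which find nothing new, and the average delay is still 30 seconds.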
This architecture puts each ERDDAP administrator in charge of determining where the data
for his/her ERDDAP comes from. Other ERDDAP administrators can do the same.
There is no need for coordination between administrators.
If many ERDDAP administrators link to each other's ERDDAPs,
a data distribution network is formed.
Data will be quickly, efficiently, and automatically disseminated from data sources
(ERDDAPs and other servers) to data redistribution sites (ERDDAPs) anywhere in the network.
A given ERDDAP can be both a source of data for some datasets and a redistribution site
for other datasets.
The resulting network is roughly similar to data distribution networks set up with programs
like Unidata's IDD/LDM, but less rigidly structured.
DAP? OPeNDAP? DODS? ERDDAP? What's the difference? My (Bob's) understanding is:
DODS (Distributed Oceanographic Data System) was created in the 1990's, before there was http: (!). The DODS system created and used the dods: protocol on the Internet. When HTTP came along and was so successful, they switched from dods: to http:.
At some point, they realized the system was useful for more than just oceanographic data.
So they ditched that DODS name (although it lives on in some code),
formed a small organization called OPeNDAP
and wrote the DAP (Data Access Protocol) specification,
which standardizes the format of the requests for metadata and/or data,
and the responses with the metadata and/or data.
OPeNDAP (the organization) still shepherds DAP (the specification) and is the author of Hyrax (the data server which
is often mistakenly referred to as OPeNDAP).
Hyrax, THREDDS, GrADS, ERDDAP and others are data servers (software) which implement DAP. They each implement a subset of DAP but do other things very differently.
ERDDAP uses code (in the "dods" directory) (actually written by Jake Hamby at NASA JPL)
for some features of reading data from external DAP servers.
ERDDAP uses its own code to write out DAP responses.
Is ERDDAP a solution to everyone's data distribution / data access problems?
No. ERDDAP tries to find a sweet spot that is a really good solution to most of the
data distribution problems that we confronted.
ERDDAP takes a middleware approach:
It can get data from lots of different types of remote data servers
and it can give that data to clients in lots of different file formats.
It is designed to be an agnostic solution which seeks to make other data servers
(OPeNDAP, SOS, OBIS, WMS, ...) interoperable.
Is there one perfect data server that meets everyone's needs perfectly? We don't think so.
And even if you think there is or will be, it will be a long time before everyone switches
to it, if ever. Until then, ERDDAP is available right now to make other data servers
interoperable and to serve data right now.
ERDDAP can handle many/most datasets as is, but not all. It isn't that the remaining datasets (e.g., model data using a cubed sphere projection) aren't important. It's just that ERDDAP's goal of returning data in common file formats (some of which are pretty simple), precludes a more complex internal data structure. Groups of researchers working with more complex data structures often already have specialized data servers and specialized client software which are customized to their community's needs. ERDDAP, as a general purpose data server, doesn't try to compete with these specialized data servers. They are customized to the needs of their community and do a great job. However, those datasets are often only "understood" by the specialized software in that community.
A Work-Around for Complex Datasets - ERDDAP has a way to handle complex datasets that it
can't handle directly. Just as a
relational database
can store a complex dataset by using just
one simple data structure (a table), ERDDAP can serve the data from more complex datasets by
breaking the source dataset into a few ERDDAP datasets, each with similar, simple data structures.
For example, some gridded environmental model datasets can be stored in ERDDAP by
putting the sea surface variables ([time][latitude][longitude]) in one ERDDAP dataset,
and by putting the variables with altitude ([time][altitude][latitude][longitude])
in another ERDDAP dataset. We know this isn't ideal, but it is necessary to allow ERDDAP
to return data in common file formats (some of which are pretty simple).
Another approach to dealing with complex datasets (e.g., for model data using a cubed
sphere projection) is to also offer a reprojected version of the dataset
([time][altitude][latitude][longitude]) which ERDDAP can work with easily.
These simpler data structures aren't meant to replace the original data structures,
but they can be a useful way to distribute the data to a wider audience.
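The splitting step described above can be sketched in a few lines of Python: group the source variables by their dimensions, so that each group has one simple structure and can become one dataset. The variable names are hypothetical, and this only illustrates the idea; it is not how ERDDAP datasets are actually configured.

```python
from collections import defaultdict

def split_by_dimensions(variables: dict) -> dict:
    """Group variable names by their dimension tuple; each group could become one dataset."""
    groups = defaultdict(list)
    for name, dims in variables.items():
        groups[tuple(dims)].append(name)
    return dict(groups)

# Hypothetical variables from a gridded environmental model.
source_variables = {
    "sea_surface_temperature": ("time", "latitude", "longitude"),
    "sea_surface_height": ("time", "latitude", "longitude"),
    "air_temperature": ("time", "altitude", "latitude", "longitude"),
    "wind_speed": ("time", "altitude", "latitude", "longitude"),
}

for dims, names in split_by_dimensions(source_variables).items():
    print(dims, names)
```

Here the four variables fall into two groups, matching the two ERDDAP datasets in the example above: one for the sea surface variables and one for the variables with altitude.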
How sustainable is the ERDDAP project?
ERDDAP is very sustainable.
Some people are surprised and disappointed to hear that ERDDAP is mostly
developed by one person (was Bob Simons, now Chris John).
[By the way, the opinions on this web page are my personal opinions and
do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration.]
They fear that if something happens
to me, that will be the end of ERDDAP. That is simply not true.
ERDDAP's positioning for long-term sustainability is excellent,
and close to the best it could possibly be.
Yes, I am the main developer of ERDDAP. I am a fully funded federal employee. My funding isn't "soft" money, so I don't receive or rely on grants. I spend more than half my time developing ERDDAP. The rest of my time is spent managing datasets. That work is useful for ERDDAP because I need to work with real datasets in order to know in detail what ERDDAP needs to do. My bosses fully support my work on ERDDAP because it does what I was hired to do: make it easier for fisheries scientists (primarily, but really everyone) to get scientific data from diverse sources.
The miraculous thing about software is that it costs nothing to duplicate. So to do my job, I write ERDDAP for use at ERD. I think that is the best possible way for me to do my job. That reason alone justifies the expense of developing ERDDAP. (I think it could be shown that ERDDAP has saved more NOAA scientists' time than I have spent developing ERDDAP. Time=Money.) But the side benefit is that any other organization can download, install, and use ERDDAP for free to distribute their scientific data.
Over 90 organizations in at least 14 countries use ERDDAP. Maybe there is such a thing as a free lunch.
ERDDAP is a Java program. The source code
for every version is on
GitHub,
the most commonly used system for collaborative
software projects.
Credits
ERDDAP credits are now available on a separate page.
I hope others will contribute code in the future.
If something happens to me, my bosses will hire a replacement with the specific goal that
s/he continues the development of ERDDAP.
Further, I try to write very clean code. I write Java Doc comments. I write
comments in the code. I choose variable names carefully. I follow the Java formatting guidelines.
All of this is an effort
to make the code more readable, for other programmers who want to understand
and/or change it, and for me, because, in a year or two, I will have forgotten
the details of how and why the code was written the way it was.
Clean code with good comments makes my ongoing work on ERDDAP easier, so I have a
great incentive to write clean code with good comments.
But all of my answers so far are not very important.
Only one thing is really important. Only one thing guarantees sustainability
for ERDDAP or any software project: that ERDDAP is
Free and Open Source Software (FOSS).
Specifically, ERDDAP uses Apache-compatible software licenses,
so anyone can do anything they want with the code.
Why is that important? One might think that software will be reliably available
in the future because a big
company is behind it. But Google, for example, has discontinued numerous projects
(here's a list).
I don't want to pick on Google because I really like Google and they
fund a large number of great, open-source projects. Microsoft has
discontinued projects. Apple has discontinued projects. ...
The point is that just having the backing of a large company is no assurance
that the project will continue.
The users of that software are out of luck,
unless the software was (and therefore, always is) Free and Open Source Software
(FOSS). Then, whenever there is interest by even one developer, the project can and will
continue to evolve. FOSS is an insurance policy. In fact, FOSS is the only
insurance policy, the only assurance, that matters. FOSS ensures that there
is always a way forward for the software. That is a right that no one can
take away, ever.
One might also think that software that has a large team of developers will be more sustainable than software with one main developer. But lots of developers usually need lots of funding. I know a famous, reasonably large project with 10 developers (I won't embarrass them by naming them) that is in constant serious danger of stopping the project because they don't have enough funding. They rely on grants. They always run a deficit. Their patron has always bailed them out at the last minute, but is getting really tired of bailing them out. So if they can't raise a million dollars a year in grant money (or the patron gets too tired of bailing them out), they'll stop. And the group can't conceive of having fewer than 10 developers. Each developer has a role to play in their group. In light of that, it seems to me that it is a great sign that ERDDAP can be, and is, actively developed by just one main developer (who is fully funded) with the unofficial assistance of a few others. In fact, it would be a bad sign if ERDDAP required multiple developers. That ERDDAP has just one main developer means that it isn't a huge task that requires massive ongoing funding; it is a relatively small task that requires minimal effort and funding. That is more sustainable, not less.
One might think that hiring a contracting company to write software is a good idea. For a fee, they'll provide developers and promise continuity (which is good unless/until they go out of business). But they also have you over a barrel: you must pay them what they request or there is no more development, unless the software is FOSS and you're just paying them to work on the code. With FOSS, you always have choices about how to move forward. Because ERDDAP is FOSS, contractors are always a good option for you or anyone with regard to ERDDAP: if anything happens to me (the one main developer), or if I don't have time to make some change that you want, or I retire and you don't like my replacement's work, you can always hire a contracting company to make the changes you want (or make them yourself).
In summary, ERDDAP has the two sustainability features that matter most:
How to Cite a Dataset in a Paper
It is important to let readers know how you got the data that you used in your paper.
For each dataset that you used, please look at the dataset's metadata in the
Dataset Attribute Structure section at the bottom of the .html page
for the dataset, e.g.,
http://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.html .
The metadata sometimes includes a required or suggested citation format for the dataset.
The "license" metadata sometimes lists restrictions on the use of the data.
To generate a citation for a dataset:
If you think of the dataset as a scientific article, you can generate a citation
based on the author (see the "creator_name" or "institution" metadata),
the date that you downloaded the data, the title (see the "title" metadata),
and the publisher (see the "publisher_name" metadata).
If possible, please include the specific URL(s) used to download the data.
If the dataset's metadata includes a
Digital Object Identifier (DOI),
please include that in the citation you create.
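The steps above can be sketched as a small Python function that assembles a citation from a dataset's metadata. The attribute names "creator_name", "institution", "title", and "publisher_name" come from the text above; the "doi" key, the function name, the example values, and the citation layout itself are assumptions made for illustration, so adjust them to the dataset's actual metadata and to your journal's style.

```python
from datetime import date

def build_citation(metadata: dict, url: str, downloaded: date) -> str:
    """Assemble a simple citation string from dataset metadata (the layout is only a suggestion)."""
    author = metadata.get("creator_name") or metadata.get("institution") or "Unknown author"
    title = metadata.get("title", "Untitled dataset")
    publisher = metadata.get("publisher_name", "Unknown publisher")
    parts = [
        f"{author}. {title}. {publisher}.",
        f"Downloaded {downloaded.isoformat()} from {url}.",
    ]
    doi = metadata.get("doi")  # hypothetical key; use whichever attribute holds the DOI
    if doi:
        parts.append(f"DOI: {doi}.")
    return " ".join(parts)

# Made-up metadata values, for illustration only.
example_metadata = {
    "creator_name": "Example Lab",
    "title": "Example Sea Surface Temperature",
    "publisher_name": "Example Publisher",
}
print(build_citation(example_metadata, "http://example.org/data.csv", date(2024, 1, 2)))
```

If the metadata includes a required or suggested citation format, use that instead of a generated citation.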
How to Cite ERDDAP in a Paper
If you want to cite ERDDAP itself in a scientific paper, please use something like
Simons, R.A., and Chris John. 2022. ERDDAP. http://coastwatch.pfeg.noaa.gov/erddap . Monterey, CA: NOAA/NMFS/SWFSC/ERD.
What does the acronym "ERDDAP" stand for?
"ERDDAP" used to be an acronym, but it outgrew that original description.
Now, please just think of it as a name, not an acronym.
Guidelines for Data Distribution Systems
Bob's opinions about the design
and evaluation of data distribution systems can be found
here.
You can Set Up Your Own ERDDAP Server and serve your own data.
DISCLAIMER: The opinions on this web page are Bob Simons' personal opinions and do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration.