.. doctest-skip-all
Astroquery API Specification
============================
Service Class
-------------
The query tools will be implemented as class methods, so that the standard
approach for a given web service (e.g., IRSA, UKIDSS, SIMBAD) will be

.. code-block:: python

    from astroquery.service import Service
    result = Service.query_object('M 31')

for services that do not require login, and

.. code-block:: python

    from astroquery.service import Service
    S = Service(user='username', password='password')
    result = S.query_object('M 31')

for services that do.
Query Methods
~~~~~~~~~~~~~
The classes will have the following methods where appropriate:

.. code-block:: python

    query_object(objectname, ...)
    query_region(coordinate, radius=None, width=None, height=None)
    get_images(coordinate)

They may also have other methods for querying non-standard data types
(e.g., ADS queries that may return a ``bibtex`` text block).
query_object
````````````
``query_object`` is only needed for services that are capable of parsing an
object name (e.g., SIMBAD, Vizier, NED). For other services, ``query_region``
is adequate, since any name can be converted to a coordinate via the SIMBAD
name parser.
query_region
````````````
Query a region around a coordinate.
One of these keywords *must* be specified (no default is assumed)::

    radius - an astropy Quantity object, or a string that can be parsed into one,
             e.g., '1 degree' or 1*u.degree.
             If radius is specified, the shape is assumed to be a circle.
    width  - a Quantity. Specifies the edge length of a square box.
    height - a Quantity. Specifies the height of a rectangular box.
             Must be passed with width.

Returns a `~astropy.table.Table`.
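
The shape rules above can be sketched as a small validation helper. This is only
an illustration: ``resolve_region_shape`` is a hypothetical name, not part of
astroquery, and rejecting ``radius`` combined with ``width``/``height`` is an
assumption that the text above does not state.

.. code-block:: python

    def resolve_region_shape(radius=None, width=None, height=None):
        """Return 'circle', 'square', or 'rectangle' for the shape keywords.

        Hypothetical helper illustrating the keyword rules; not astroquery API.
        """
        if radius is not None:
            # Assumption: mixing radius with box keywords is treated as an error.
            if width is not None or height is not None:
                raise ValueError("Specify either radius or width/height, not both.")
            return 'circle'
        if width is None:
            if height is not None:
                raise ValueError("height must be passed together with width.")
            # No default shape is assumed.
            raise ValueError("One of radius or width must be specified.")
        return 'square' if height is None else 'rectangle'

A real implementation would also convert each value to a Quantity and reject
unitless numbers.
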
get_images
``````````
Perform a coordinate-based query to acquire images.
Returns a list of `~astropy.io.fits.HDUList` objects.
Shape keywords are optional - some query services allow searches for images
that overlap with a specified coordinate.
(query)_async
`````````````
Covers ``get_images_async``, ``query_region_async``, and ``query_object_async``.

These behave like the query methods above, but return a list of readable file
objects instead of a parsed object, so the data is not downloaded until
``result.get_data()`` is called.
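
A minimal sketch of this deferred-download behaviour is shown below.
``DeferredResult`` and ``fake_download`` are hypothetical names used only for
illustration, and caching the data after the first ``get_data()`` call is an
assumption rather than something this specification requires.

.. code-block:: python

    class DeferredResult:
        """Hypothetical stand-in for an object returned by an ``*_async`` method."""

        def __init__(self, fetch):
            self._fetch = fetch      # zero-argument callable that downloads the data
            self._data = None
            self._fetched = False

        def get_data(self):
            # Download on the first call only; later calls reuse the cached data.
            if not self._fetched:
                self._data = self._fetch()
                self._fetched = True
            return self._data


    calls = []

    def fake_download():
        # Stand-in for a network request; records that it was invoked.
        calls.append('download')
        return b'raw bytes'

    result = DeferredResult(fake_download)   # nothing has been downloaded yet
    data = result.get_data()                 # the download happens here
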
Common Keywords
```````````````
These keywords are common to all query methods::

    get_query_payload - Return the POST data that will be submitted as a dictionary
    savename - [optional - see discussion below] File path to save the downloaded query to
    timeout - timeout in seconds

.. _api_async_queries:
Asynchronous Queries
--------------------
Some services require asynchronous query submission and download (e.g.,
Besancon, the NRAO Archive, the Fermi archive). The data needs to be "staged"
on the remote server before it can be downloaded. For these queries, the
approach is

.. code-block:: python

    result = Service.query_region_async(coordinate)

    data = result.get_data()
    # this will periodically check whether the data is available at the specified URL

Additionally, any service can be queried asynchronously - ``get_images_async``
will return readable objects that can be downloaded at a later time.
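
The periodic check behind ``get_data()`` could look roughly like the polling
loop below. This is a sketch under stated assumptions: ``wait_for_staged_data``,
``is_ready``, ``fetch``, and ``poll_interval`` are hypothetical names, and the
way a real service reports that staging has finished will differ between
archives.

.. code-block:: python

    import time


    def wait_for_staged_data(is_ready, fetch, poll_interval=1.0, timeout=60.0):
        """Poll ``is_ready()`` until it returns True, then return ``fetch()``.

        Hypothetical helper: ``is_ready`` and ``fetch`` are zero-argument
        callables supplied by the service module.
        """
        deadline = time.monotonic() + timeout
        while not is_ready():
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    "Data were not staged within %s seconds" % timeout)
            time.sleep(poll_interval)
        return fetch()


    # Simulated server that reports "ready" on the third check.
    states = iter([False, False, True])
    data = wait_for_staged_data(lambda: next(states), lambda: 'staged data',
                                poll_interval=0.0, timeout=5.0)
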
Outline of an Example Module
----------------------------
Directory Structure::

    module/
    module/__init__.py
    module/core.py
    module/tests/test_module.py

``__init__.py`` contains:

.. code-block:: python

    from astropy import config as _config


    class Conf(_config.ConfigNamespace):
        """
        Configuration parameters for `astroquery.template_module`.
        """
        server = _config.ConfigItem(
            ['http://dummy_server_mirror_1',
             'http://dummy_server_mirror_2',
             'http://dummy_server_mirror_n'],
            'Name of the template_module server to use.'
        )
        timeout = _config.ConfigItem(
            30,
            'Time limit for connecting to template_module server.'
        )


    conf = Conf()

    from .core import QueryClass

    __all__ = ['QueryClass', 'conf']

``core.py`` contains:

.. code-block:: python

    from astropy.table import Table
    from astropy.utils.data import get_readable_fileobj

    from ..query import BaseQuery
    from ..utils.class_or_instance import class_or_instance
    from ..utils import async_to_sync
    from . import conf

    __all__ = ['QueryClass']  # specifies what to import


    @async_to_sync
    class QueryClass(BaseQuery):

        server = conf.server
        timeout = conf.timeout

        def __init__(self, *args):
            """ set some parameters """
            # do login here
            pass

        @class_or_instance
        def query_region_async(self, *args, get_query_payload=False):

            request_payload = self._args_to_payload(*args)

            # primarily for debug purposes, but also useful if you want to send
            # someone a URL linking directly to the data; return the payload
            # before any request is sent
            if get_query_payload:
                return request_payload

            response = self._request(method="POST", url=self.server,
                                     data=request_payload, timeout=self.timeout)

            return response

        @class_or_instance
        def get_images_async(self, *args):
            image_urls = self.get_image_list(*args)
            return [get_readable_fileobj(U) for U in image_urls]
            # get_readable_fileobj returns need a "get_data()" method?

        @class_or_instance
        def get_image_list(self, *args):

            request_payload = self._args_to_payload(*args)

            response = self._request(method="POST", url=self.server,
                                     data=request_payload, timeout=self.timeout)

            return self.extract_image_urls(response.text)

        def _parse_result(self, result):
            # do something, probably with regexp's, to build tabular_data
            return Table(tabular_data)

        def _args_to_payload(self, *args):
            # convert arguments to a valid requests payload (a dict)
            payload = {}
            return payload

Parallel Queries
----------------
To run several queries in parallel through a single logged-in instance, you could do:

.. code-block:: python

    from astroquery.module import QueryClass
    QC = QueryClass(login_information)

    results = parallel_map(QC.query_object, ['m31', 'm51', 'm17'],
                           radius=['1"', '1"', '1"'])

    # alternatively, submit each query through the async interface
    results = [QC.query_object_async(obj, radius=r)
               for obj, r in zip(['m31', 'm51', 'm17'], ['1"', '1"', '1"'])]

Here ``parallel_map()`` is a parallel implementation of some map function.
.. TODO::
Include a ``parallel_map`` function in ``astroquery.utils``
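
One possible shape for such a function is sketched below with a thread pool
from the standard library. It is an illustration only, not the astroquery
implementation: the ``max_workers`` parameter and the convention that every
keyword argument is a sequence with one entry per item are assumptions chosen
to match the call shown above.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def parallel_map(func, items, max_workers=4, **per_item_kwargs):
        """Apply ``func`` to each item concurrently, preserving input order.

        Assumption: each keyword argument is a sequence with one entry per item.
        """
        items = list(items)
        for name, values in per_item_kwargs.items():
            if len(values) != len(items):
                raise ValueError("%s must have one entry per item" % name)

        def call(index):
            kwargs = {name: values[index]
                      for name, values in per_item_kwargs.items()}
            return func(items[index], **kwargs)

        # Threads suit this case because queries spend their time waiting on I/O.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(call, range(len(items))))


    def fake_query(name, radius):
        # Stand-in for QC.query_object; no network access.
        return (name.upper(), radius)

    results = parallel_map(fake_query, ['m31', 'm51', 'm17'],
                           radius=['1"', '2"', '3"'])
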
Exceptions
----------
* What errors should be thrown if queries fail?
  Failed queries should raise a custom Exception that will include the full
  HTML (or XML) of the failure, but where possible should parse the web page's
  error message into something useful.

* How should timeouts be handled?
  Timeouts should raise a ``TimeoutError``.
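
A minimal sketch of the failed-query policy follows. ``QueryFailedError`` and
``raise_for_error_page`` are hypothetical names, and the assumption that a
service reports errors in the page ``<title>`` is made up for the example; each
real service needs its own parsing.

.. code-block:: python

    import re


    class QueryFailedError(Exception):
        """Hypothetical exception that keeps the full server response."""

        def __init__(self, message, raw_response):
            super().__init__(message)
            self.raw_response = raw_response   # full HTML/XML of the failure


    def raise_for_error_page(text):
        """Raise QueryFailedError if ``text`` looks like an error page.

        Assumption: errors appear as ``<title>Error: ...</title>``.
        """
        match = re.search(r'<title>\s*Error:?\s*(.*?)\s*</title>', text,
                          re.IGNORECASE | re.DOTALL)
        if match:
            raise QueryFailedError(match.group(1), raw_response=text)
        return text


    page = '<html><title>Error: object not found</title></html>'
    try:
        raise_for_error_page(page)
    except QueryFailedError as exc:
        message = str(exc)          # short, parsed message
        full_page = exc.raw_response  # complete response for debugging
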
Examples
--------
Standard usage should be along these lines:

.. code-block:: python

    from astroquery.simbad import Simbad
    result = Simbad.query_object("M 31")
    # returns an astropy.table.Table

    from astroquery.irsa import Irsa
    images = Irsa.get_images("M 31", "5 arcmin")
    # searches for images in a 5-arcminute circle around M 31
    # returns a list of HDUList objects

    images = Irsa.get_images("M 31")
    # searches for images overlapping with the SIMBAD position of M 31,
    # if supported by the service
    # returns a list of HDUList objects

    from astroquery.ukidss import Ukidss
    Ukidss.login(username, password)

    result = Ukidss.query_region("5.0 0.0 gal", catalog='GPS')
    # FAILS: no radius specified!
    result = Ukidss.query_region("5.0 0.0 gal", catalog='GPS', radius=1)
    # FAILS: no assumed units!
    result = Ukidss.query_region("5.0 0.0 gal", catalog='GPS', radius='1 arcmin')
    # SUCCEEDS! returns an astropy.table.Table

    from astropy.coordinates import SkyCoord
    import astropy.units as u
    result = Ukidss.query_region(
        SkyCoord(5, 0, unit=('deg', 'deg'), frame='galactic'),
        catalog='GPS', region='circle', radius=5*u.arcmin)
    # returns an astropy.table.Table

    from astroquery.nist import Nist
    hydrogen = Nist.query(4000*u.AA, 7000*u.AA, linename='H I', energy_level_unit='eV')
    # returns an astropy.table.Table

For tools in which multiple catalogs can be queried, e.g. as in the UKIDSS
examples, they must be specified. There should also be a ``list_catalogs``
function that returns a ``list`` of catalog name strings:

.. code-block:: python

    print(Ukidss.list_catalogs())

Unparseable Data
~~~~~~~~~~~~~~~~
If data cannot be parsed into its expected form (`~astropy.table.Table`,
`~astropy.io.fits.PrimaryHDU`), the raw unparsed data will be returned and a
``Warning`` issued.
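
The fallback could be sketched as follows. ``parse_or_return_raw`` is a
hypothetical helper, and ``json.loads`` merely stands in for a real table or
FITS parser so that the example runs without astropy.

.. code-block:: python

    import json
    import warnings


    def parse_or_return_raw(raw, parser):
        """Return ``parser(raw)``, or ``raw`` itself with a warning on failure."""
        try:
            return parser(raw)
        except Exception as exc:
            warnings.warn("Could not parse the response; "
                          "returning raw data (%s)" % exc)
            return raw


    parsed = parse_or_return_raw('{"ra": 10.68}', json.loads)

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        # Not valid JSON, so the raw string comes back and a warning is issued.
        unparsed = parse_or_return_raw('<html>error</html>', json.loads)
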