Introduction
------------
This module extracts basic provenance information from the header of a VOTable. The information is described in the
DataOrigin IVOA note: https://www.ivoa.net/documents/DataOrigin/.
DataOrigin includes both the query information (such as publisher, contact, versions, etc.)
and the dataset origin (such as creator, bibliographic links, URL, etc.).
This API retrieves the metadata from the INFO elements of the VOTable.
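
To make the idea of "metadata stored in INFO elements" concrete, the following is a minimal sketch that reads INFO name/value pairs with the Python standard library only. It is not how ``astropy.io.votable.dataorigin`` is implemented; the XML snippet is hand-written for illustration, and the helper name ``read_infos`` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, minimal VOTable header used for illustration only.
VOTABLE_XML = """<VOTABLE version="1.4" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <INFO name="publisher" value="CDS"/>
  <INFO name="contact" value="cds-question@unistra.fr"/>
  <RESOURCE>
    <INFO name="creator" value="Hong K.">First author or institution</INFO>
    <INFO name="cites" value="bibcode:2024AJ....167...18H"/>
  </RESOURCE>
</VOTABLE>
"""

# Namespace declared in the snippet above.
NS = {"vot": "http://www.ivoa.net/xml/VOTable/v1.3"}


def read_infos(element):
    """Return (name, value) pairs for the INFO children directly under element.

    Hypothetical helper for this sketch; not part of astropy.
    """
    return [
        (info.get("name"), info.get("value"))
        for info in element.findall("vot:INFO", NS)
    ]


root = ET.fromstring(VOTABLE_XML)

# INFO directly under VOTABLE: information about the query.
query_infos = read_infos(root)

# INFO under RESOURCE: information about the origin of the dataset.
dataset_infos = read_infos(root.find("vot:RESOURCE", NS))

print(query_infos)
print(dataset_infos)
```

In this sketch the INFO elements at the top level play the role of the query information, and those nested in the RESOURCE play the role of the dataset origin.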
Getting Started
---------------
The following example first reconstructs the DataOrigin metadata of a VOTable, based on a query to
the VizieR catalogue J/AJ/167/18. In practice, you would obtain this table directly from
the VO service of interest::
>>> from astropy.io.votable.dataorigin import add_data_origin_info
>>> from astropy.io.votable.tree import VOTableFile
>>> from astropy.table import Column, Table
>>> # For this example, the table data itself is irrelevant.
>>> table = Table([
...     Column(name="id", data=[1, 2, 3, 4]),
...     Column(name="bmag", unit="mag", data=[5.6, 7.9, 12.4, 11.3])])
>>> votable = VOTableFile.from_table(table)
>>> votable.description = "Period variations of 32 contact binaries (Hong+, 2024)"
>>> # Order is important here for the example.
>>> add_data_origin_info(votable, "ivoid", "ivo://cds.vizier/j/aj/167/18",
...                      content="IVOID of underlying data collection")
>>> add_data_origin_info(votable, "creator", "Hong K.",
...                      content="First author or institution")
>>> add_data_origin_info(votable, "cites", "bibcode:2024AJ....167...18H",
...                      content="Article or Data origin sources")
>>> add_data_origin_info(votable, "editor", "Astronomical Journal (AAS)",
...                      content="Editor name (article)")
>>> add_data_origin_info(votable, "original_date", "2024",
...                      content="Year of the article publication")
>>> # The rest in alphabetical order.
>>> add_data_origin_info(votable, "citation", "doi:10.26093/cds/vizier.51670018")
>>> add_data_origin_info(votable, "contact", "cds-question@unistra.fr")
>>> add_data_origin_info(votable, "publication_date", "2024-11-06")
>>> add_data_origin_info(votable, "publisher", "CDS")
>>> add_data_origin_info(votable, "reference_url", "https://cdsarc.cds.unistra.fr/viz-bin/cat/J/AJ/167/18")
>>> add_data_origin_info(votable, "request_date", "2025-03-05T14:18:05")
>>> add_data_origin_info(votable, "rights_uri", "https://cds.unistra.fr/vizier-org/licences_vizier.html")
>>> add_data_origin_info(votable, "server_software", "7.4.5")
>>> add_data_origin_info(votable, "service_protocol", "ivo://ivoa.net/std/ConeSearch/v1.03")
To extract the DataOrigin from the VOTable::
>>> from astropy.io.votable.dataorigin import extract_data_origin
>>> data_origin = extract_data_origin(votable)
>>> print(data_origin)
publisher: CDS
server_software: 7.4.5
service_protocol: ivo://ivoa.net/std/ConeSearch/v1.03
request_date: 2025-03-05T14:18:05
contact: cds-question@unistra.fr
<BLANKLINE>
ivoid: ivo://cds.vizier/j/aj/167/18
citation: doi:10.26093/cds/vizier.51670018
reference_url: https://cdsarc.cds.unistra.fr/viz-bin/cat/J/AJ/167/18
rights_uri: https://cds.unistra.fr/vizier-org/licences_vizier.html
creator: Hong K.
editor: Astronomical Journal (AAS)
cites: bibcode:2024AJ....167...18H
original_date: 2024
publication_date: 2024-11-06
Contents and metadata
---------------------
`astropy.io.votable.dataorigin.extract_data_origin` returns an `astropy.io.votable.dataorigin.DataOrigin` (class) container, which is made of:
* an `astropy.io.votable.dataorigin.QueryOrigin` (class) container describing the request.
``QueryOrigin`` is considered to be unique for the whole VOTable.
It includes metadata such as the publisher, the contact, the date of execution, the query, etc.
* a list of `astropy.io.votable.dataorigin.DatasetOrigin` (class) containers, one for each Element having DataOrigin information.
``DatasetOrigin`` holds the basic provenance of the datasets queried. Each attribute is a list.
It includes metadata such as authors, ivoid, landing pages, etc.
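
The layout described above can be pictured with a simplified model. The classes below are hypothetical stand-ins written for this sketch (note the ``Sketch`` suffix); they are not the astropy classes and list only a few of the fields mentioned on this page. They illustrate the two points that matter when reading the container: there is a single query description, and each dataset attribute holds a list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryOriginSketch:
    """Hypothetical model: one description of the request per VOTable."""
    publisher: Optional[str] = None
    contact: Optional[str] = None
    request_date: Optional[str] = None


@dataclass
class DatasetOriginSketch:
    """Hypothetical model: provenance of one dataset; every field is a list."""
    ivoid: List[str] = field(default_factory=list)
    creator: List[str] = field(default_factory=list)
    citation: List[str] = field(default_factory=list)


@dataclass
class DataOriginSketch:
    """Hypothetical model: a single query plus a list of dataset origins."""
    query: QueryOriginSketch = field(default_factory=QueryOriginSketch)
    origin: List[DatasetOriginSketch] = field(default_factory=list)


sketch = DataOriginSketch(
    query=QueryOriginSketch(publisher="CDS"),
    origin=[DatasetOriginSketch(creator=["Hong K."])],
)

# Scalar access for the query, indexed list access for a dataset.
print(sketch.query.publisher)
print(sketch.origin[0].creator)
```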
Examples
--------
Get the (Data Center) publisher and the Creator of the dataset::
>>> print(data_origin.query.publisher)
CDS
>>> print(data_origin.origin[0].creator)
['Hong K.']
Other capabilities
------------------
The DataOrigin container also gives access to the underlying VO elements:
* Extract list of `astropy.io.votable.tree.Info`:
>>> # get DataOrigin with the description of each INFO
>>> for dataset_origin in data_origin.origin:
...     for info in dataset_origin.infos:
...         print(f"{info.name}: {info.value} ({info.content})")
ivoid: ivo://cds.vizier/j/aj/167/18 (IVOID of underlying data collection)
creator: Hong K. (First author or institution)
cites: bibcode:2024AJ....167...18H (Article or Data origin sources)
editor: Astronomical Journal (AAS) (Editor name (article))
original_date: 2024 (Year of the article publication)
...
* Extract the tree node `astropy.io.votable.tree.Element`:
The following example extracts the citation from the header (in APA style):
>>> # get the Title retrieved in Element
>>> origin = data_origin.origin[0]
>>> vo_elt = origin.get_votable_element()
>>> title = vo_elt.description if vo_elt else ""
>>> print(f"APA: {', '.join(origin.creator)} ({origin.publication_date[0]}). {title} [Dataset]. {data_origin.query.publisher}. {origin.citation[0]}")
APA: Hong K. (2024-11-06). Period variations of 32 contact binaries (Hong+, 2024) [Dataset]. CDS. doi:10.26093/cds/vizier.51670018
* Add Data Origin INFO into VOTable:
>>> from astropy.io.votable import dataorigin
>>> dataorigin.add_data_origin_info(votable, "query", "Data center name")
>>> dataorigin.add_data_origin_info(votable.resources[0], "creator", "Author name")
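
The APA-style citation built earlier in this section indexes ``publication_date[0]`` and ``citation[0]`` directly, which assumes those lists are present and non-empty. The sketch below shows one possible way to make the formatting tolerant of missing metadata. The function ``format_apa`` and its fallback strings are hypothetical choices made for this example; it takes plain Python values, so it does not depend on astropy.

```python
def format_apa(creators, publication_date, title, publisher, citation):
    """Build an APA-like dataset citation from list-valued metadata.

    Hypothetical helper: list arguments may be None or empty, in which
    case a placeholder is used instead of raising an exception.
    """
    authors = ", ".join(creators or []) or "Unknown author"
    date = publication_date[0] if publication_date else "n.d."
    identifier = citation[0] if citation else ""
    text = f"{authors} ({date}). {title} [Dataset]. {publisher}."
    # strip() removes the trailing space left when there is no identifier.
    return f"{text} {identifier}".strip()


# Values copied from the example on this page.
full = format_apa(
    ["Hong K."],
    ["2024-11-06"],
    "Period variations of 32 contact binaries (Hong+, 2024)",
    "CDS",
    ["doi:10.26093/cds/vizier.51670018"],
)
print(full)

# Same call with missing metadata.
partial = format_apa(None, None, "Untitled", "CDS", None)
print(partial)
```

With the objects from the example above, it could be called as ``format_apa(origin.creator, origin.publication_date, title, data_origin.query.publisher, origin.citation)``.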