ASCII Tables (astropy.io.ascii)¶
Introduction¶
astropy.io.ascii provides methods for reading and writing a wide range of ASCII data table
formats via built-in Extension Reader classes. The emphasis is on flexibility and ease of use,
although readers can optionally use a less flexible C/Cython engine that reads and writes
faster.
The following shows a few of the ASCII formats that are available, while the section on Supported formats contains the full list.
Basic: basic table with customizable delimiters and header configurations
Cds: CDS format table (also Vizier and ApJ machine readable tables)
Daophot: table from the IRAF DAOphot package
Ecsv: ECSV format for lossless round-trip of data tables
FixedWidth: table with fixed-width columns (see also Fixed-width Gallery)
Ipac: IPAC format table
HTML: HTML format table contained in a <table> tag
Latex: LaTeX table with data values in the tabular environment
Rdb: tab-separated values with an extra line after the column definition line
SExtractor: SExtractor format table
The astropy.io.ascii package is built on a modular and extensible class
structure with independent Base class elements so that new formats can
be easily accommodated.
Note
It is also possible (and encouraged) to use the functionality from
astropy.io.ascii through a higher-level interface in the
Data Tables package. See Unified file read/write interface for more details.
Getting Started¶
Reading Tables¶
The majority of commonly encountered ASCII tables can be easily read with the read()
function. Assume you have a file named sources.dat with the following contents:
obsid redshift X Y object
3102 0.32 4167 4085 Q1250+568-A
877 0.22 4378 3892 "Source 82"
This table can be read with the following:
>>> from astropy.io import ascii
>>> data = ascii.read("sources.dat")
>>> print(data)
obsid redshift X Y object
----- -------- ---- ---- -----------
3102 0.32 4167 4085 Q1250+568-A
877 0.22 4378 3892 Source 82
The first argument to the read() function can be the name of a file, a string
representation of a table, or a list of table lines. The return value
(data in this case) is a Table object.
By default read() will try to guess the table format
by trying all the supported formats. If this does not work (for unusually
formatted tables), then one needs to give astropy.io.ascii additional hints about
the format, for example:
>>> lines = ['objID & osrcid & xsrcid ',
... '----------------------- & ----------------- & -------------',
... ' 277955213 & S000.7044P00.7513 & XS04861B6_005',
... ' 889974380 & S002.9051P14.7003 & XS03957B7_004']
>>> data = ascii.read(lines, data_start=2, delimiter='&')
>>> print(data)
objID osrcid xsrcid
--------- ----------------- -------------
277955213 S000.7044P00.7513 XS04861B6_005
889974380 S002.9051P14.7003 XS03957B7_004
If the format of a file is known (e.g. it is a fixed-width table or an IPAC table),
then it is more efficient and reliable to provide a value for the format
argument from one of the values in the supported formats. For example:
>>> data = ascii.read(lines, format='fixed_width_two_line', delimiter='&')
For simpler formats such as CSV, read() will automatically try reading with the
Cython/C parsing engine, which is significantly faster than the ordinary Python
implementation (described in Fast ASCII I/O). If the fast engine fails,
read() will fall back on the Python reader by default. The argument
fast_reader can be specified to control this behavior. For example, to
disable the fast engine:
>>> data = ascii.read(lines, format='csv', fast_reader=False)
For reading very large tables see the section on Reading large tables in chunks.
Note
Reading a table which contains Unicode characters is supported; if you need
a different encoding, you can specify the encoding parameter in the
pure-Python readers.
Writing Tables¶
The write() function provides a way to write a data table as a formatted ASCII
table. For example, the following writes a table as a simple space-delimited
file:
>>> import numpy as np
>>> from astropy.table import Table, Column, MaskedColumn
>>> x = np.array([1, 2, 3])
>>> y = x ** 2
>>> data = Table([x, y], names=['x', 'y'])
>>> ascii.write(data, 'values.dat')
The values.dat file will then contain:
x y
1 1
2 4
3 9
Most of the Reader formats that astropy.io.ascii supports for reading are
also supported for writing. This provides a great deal of flexibility in the
format for writing. The example below writes the data as a LaTeX table, using
the option to send the output to sys.stdout instead of a file:
>>> import sys
>>> ascii.write(data, sys.stdout, format='latex')
\begin{table}
\begin{tabular}{cc}
x & y \\
1 & 1 \\
2 & 4 \\
3 & 9 \\
\end{tabular}
\end{table}
There is also a faster Cython engine for writing simple formats,
which is enabled by default for these formats (see Fast ASCII I/O).
To disable this engine, use the parameter fast_writer:
>>> ascii.write(data, 'values.csv', format='csv', fast_writer=False)
Finally, one can write data in the ECSV format, which preserves table metadata such as column data types and units. In this way a data table (including one with masked entries) can be stored and read back as ASCII with no loss of information:
>>> t = Table(masked=True)
>>> t['x'] = MaskedColumn([1.0, 2.0], unit='m', dtype='float32')
>>> t['x'][1] = np.ma.masked
>>> t['y'] = MaskedColumn([False, True], dtype='bool')
>>> import io
>>> fh = io.StringIO()
>>> t.write(fh, format='ascii.ecsv') # doctest: +SKIP
>>> table_string = fh.getvalue() # doctest: +SKIP
>>> print(table_string) # doctest: +SKIP
# %ECSV 0.9
# ---
# datatype:
# - {name: x, unit: m, datatype: float32}
# - {name: y, datatype: bool}
x y
1.0 False
"" True
>>> Table.read(table_string, format='ascii') # doctest: +SKIP
<Table masked=True length=2>
x y
m
float32 bool
------- -----
1.0 False
-- True
Note
For most supported formats one can write a masked table and then read it back without losing information about the masked table entries. This is accomplished by using a blank string entry to indicate a masked (missing) value. See the Bad or missing values section for more information.
Supported formats¶
A full list of the supported format values and corresponding format types for ASCII
tables is given below. The Write column indicates which formats support write
functionality, and the Fast column indicates which formats are compatible with
the fast Cython/C engine for reading and writing.
Format | Write | Fast | Description |
---|---|---|---|
aastex | Yes | | AASTex: AASTeX deluxetable used for AAS journals |
basic | Yes | Yes | Basic: Basic table with custom delimiters |
cds | | | Cds: CDS format table |
commented_header | Yes | Yes | CommentedHeader: Column names in a commented line |
csv | Yes | Yes | Csv: Basic table with comma-separated values |
daophot | | | Daophot: IRAF DAOphot format table |
ecsv | Yes | | Ecsv: Enhanced CSV format |
fixed_width | Yes | | FixedWidth: Fixed width |
fixed_width_no_header | Yes | | FixedWidthNoHeader: Fixed width with no header |
fixed_width_two_line | Yes | | FixedWidthTwoLine: Fixed width with second header line |
html | Yes | | HTML: HTML format table |
ipac | Yes | | Ipac: IPAC format table |
latex | Yes | | Latex: LaTeX table |
no_header | Yes | Yes | NoHeader: Basic table with no headers |
rdb | Yes | Yes | Rdb: Tab-separated with a type definition header line |
rst | Yes | | RST: reStructuredText simple format table |
sextractor | | | SExtractor: SExtractor format table |
tab | Yes | Yes | Tab: Basic table with tab-separated values |
Attention
ECSV is recommended
For writing and reading tables to ASCII in a way that fully reproduces the
table data, types and metadata (i.e. the table will “round-trip”), we highly
recommend using the ECSV format. This writes the actual data in a
simple space-delimited format (the basic format) that any ASCII table
reader can parse, but also includes metadata encoded in a comment block that
allows full reconstruction of the original columns. This includes support
for Mixin columns (such as SkyCoord or Time) and Masked columns.
Using astropy.io.ascii¶
The details of using astropy.io.ascii are provided in the following sections:
Reading tables¶
Writing tables¶
Fixed-width Gallery¶
Fast ASCII Engine¶
Base class elements¶
Extension Reader classes¶
Performance Tips¶
By default, when trying to read a file, the reader will guess the format, which involves trying to read it with many different readers. Especially when dealing with large tables, it is much better performance-wise if you can specify the format and any options explicitly, and also turn off guessing. For example, if you are reading a simple CSV file with a one-line header with column names, the following:
read('example.csv', format='basic', delimiter=',', guess=False) # doctest: +SKIP
can be at least an order of magnitude faster than:
read('example.csv') # doctest: +SKIP
Reference/API¶
astropy.io.ascii Package¶
An extensible ASCII table reader and writer.
Functions¶
convert_numpy(numpy_type) | Return a tuple containing a function which converts a list into a numpy array and the type produced by the converter function. |
get_read_trace() | Return a traceback of the attempted read formats for the last call to read where guessing was enabled. |
get_reader([Reader, Inputter, Outputter]) | Initialize a table reader allowing for common customizations. |
get_writer([Writer, fast_writer]) | Initialize a table writer allowing for common customizations. |
read(table[, guess]) | Read the input table and return the table. |
set_guess(guess) | Set the default value of the guess parameter for read() |
write(table[, output, format, Writer, …]) | Write the input table to filename. |
Classes¶
AASTex(**kwargs) | Write and read AASTeX tables. |
AllType | Subclass of all other data types. |
BaseData() | Base table data reader. |
BaseHeader() | Base table header reader |
BaseInputter | Get the lines from the table input and return a list of lines. |
BaseOutputter | Output table as a dict of column objects keyed on column name. |
BaseReader() | Class providing methods to read and write an ASCII table using the specified header, data, inputter, and outputter instances. |
BaseSplitter | Base splitter that uses python’s split method to do the work. |
Basic() | Read a character-delimited table with a single header line at the top followed by data lines to the end of the table. |
BasicData() | Basic table Data Reader |
BasicHeader() | Basic table Header Reader |
Cds([readme]) | Read a CDS format table. |
Column(name) | Table column. |
CommentedHeader() | Read a file where the column names are given in a line that begins with the header comment character. |
ContinuationLinesInputter | Inputter where lines ending in continuation_char are joined with the subsequent line. |
Csv() | Read a CSV (comma-separated-values) file. |
Daophot() | Read a DAOphot file. |
DefaultSplitter | Default class to split strings into columns using python csv. |
Ecsv() | Read a file which conforms to the ECSV (Enhanced Character Separated Values) format. |
FastBasic([default_kwargs]) | This class is intended to handle the same format addressed by the ordinary Basic writer, but it acts as a wrapper for underlying C code and is therefore much faster. |
FastCommentedHeader(**kwargs) | A faster version of the CommentedHeader reader, which looks for column names in a commented line. |
FastCsv(**kwargs) | A faster version of the ordinary Csv writer that uses the optimized C parsing engine. |
FastNoHeader(**kwargs) | This class uses the fast C engine to read tables with no header line. |
FastRdb(**kwargs) | A faster version of the Rdb reader. |
FastTab(**kwargs) | A faster version of the ordinary Tab reader that uses the optimized C parsing engine. |
FixedWidth([col_starts, col_ends, …]) | Read or write a fixed width table with a single header line that defines column names and positions. |
FixedWidthData() | Base table data reader. |
FixedWidthHeader() | Fixed width table header reader. |
FixedWidthNoHeader([col_starts, col_ends, …]) | Read or write a fixed width table which has no header line. |
FixedWidthSplitter | Split line based on fixed start and end positions for each col in self.cols. |
FixedWidthTwoLine([position_line, …]) | Read or write a fixed width table which has two header lines. |
FloatType | Describes floating-point data. |
HTML([htmldict]) | Read and write HTML tables. |
InconsistentTableError | Indicates that an input table is inconsistent in some way. |
IntType | Describes integer data. |
Ipac([definition, DBMS]) | Read or write an IPAC format table. |
Latex([ignore_latex_commands, latexdict, …]) | Write and read LaTeX tables. |
NoHeader() | Read a table with no header line. |
NoType | Superclass for StrType and NumType classes. |
NumType | Indicates that a column consists of numerical data. |
ParameterError | Indicates that a reader cannot handle a passed parameter. |
RST() | Read or write a reStructuredText simple format table. |
Rdb() | Read a tab-separated file with an extra line after the column definition line. |
SExtractor() | Read a SExtractor file. |
StrType | Indicates that a column consists of text data. |
Tab() | Read a tab-separated file. |
TableOutputter | Output the table as an astropy.table.Table object. |
WhitespaceSplitter |