Re: [DIYbio] Re: Standardized DIYbio report format? WAS: Endophyte isolation and first successful sequencing

# My paper's working title
Author(s): _Garvey, C., Glowbiotics Ltd._
Date: 14th August, 2013
Copyright: _Public Domain, as all scientific knowledge should be._

## Subtitle prior to abstract
This is all the abstract stuff; I fill this in as I please. If using the attribute
extension, this paragraph can be followed by a simple attribute tag as below, which
adds the "abstract" class to this paragraph in the rendered HTML, allowing it to
be styled differently from the rest of the document. This is, however, not essential,
as a script could be designed merely to treat the first paragraph of the document
differently.
{.abstract}

## Preface
This is a brief overview of the state-of-the-art prior to my research, providing
background on what is known and what isn't in the topic area.

It then lays out in a later paragraph what this research investigates, without
presenting too much on the outcomes except to broadly state what they were.

## Methods
This is a _concise_ and _explicit_ list of _all_ procedures, materials and methods,
if possible including the _ingredients and constituents of any commercial products
used_. It should contain as much information as a person reading the paper would
need to replicate the work, provided they have basic skills expected of someone
performing such research to begin with.

* This is a list item for one of my methods. It can contain multiple paragraphs,
provided they are indented to match the list.

* This is another list item for another method.

## Results
This is an area where all results are presented in detail. If there is a great deal
of data, then an explicit summary of the data should be presented here, and the
data presented in an _attached document_ in an _open format_. Online data in
webapps, proprietary or otherwise, is not suitable for publication, as clouds rain
and webapps crash. Data should be serialised to an open, offline format and presented
alongside the research data for immediate access by readers.

If the data is too long for a reasonably rendered results area, but not so long
that it amounts to a separate paper in itself, it can be attached to the document
and presented *after* the references, following a page-break. This makes one
readable, more portable document with all data attached.

Inline markdown image tags can be used with referential paths to embed images,
like so: ![Alt text](/path/to/img.jpg "Optional Title").

Or, for clarity, the alternative link syntax where links/embeds are defined as footnotes
and used inline by defined-id can be used. So, I can embed a data image:
![An image detailing some data][data_image_1]
..and define the image tag anywhere (it will not be rendered where it is defined,
so it's most clear and useful to define links and images in the section in which
they are first used).

[data_image_1]: /path/to/img2.png "Optional Title"

## Conclusions
Here, result data is interpreted according to best available knowledge in the area.
Assumptions or suspicions *may* be placed here if *clearly marked as such* and with
a clear explanation of the line of reasoning, allowing others to suggest or implement
experiments to test the assumptions in ensuing work and further the study of the topic
area.

## References
1. Garvey, C. "An ordered list of references", [_Proceedings of DIYbio_][POD] [Thread 01, August 2013](https://proceedingsofdiybio.com/2013/aug/01)
2. Bishop, B. "Why YAML is totally better than Markdown", [_DIYbio Enhancement Proposal 12_](http://gnusha.org/skdb/), August 2013

[POD]: https://proceedingsofdiybio.com "Proceedings of DIYbio Site"
To further illustrate my case for markdown, here's a template document
showing how a scientific paper could be formatted in Markdown for
rendering.

I make light use of the html-attributes extension that allows headers,
paragraphs or other items to be given class and id attributes in
rendered HTML: I don't know if this is supported in LaTeX processors
but if it is it's ideal for marking, say, the abstract for smaller font
and a centered full-page flow, whereas the rest of the document would
be styled perhaps according to the two-column layout favoured by many
journals.

Python Markdown also supports an extension that lets metadata be
defined at the very top of the document, but I don't think this is
common among other Markdown processors, so simply standardising a
script-readable layout for the document and its references would
probably be best.

If we're coming up with our own standard for publishing and want to "do
it better", there's a lot in the older publishing standards that can be
improved upon. Many of these were written to be unambiguous when
printed on dead trees; I support this of course, but hyperlinks look
cleaner on a webpage or document. We could specify a more
machine-parseable reference format (without sacrificing human
readability), perhaps using indented YAML or something, and have that
render into a nicely formatted dead-tree reference in LaTeX and a
hyperlink with machine-parseable attributes in HTML.

Perhaps it'd look like this; easy to read and write, machine parseable,
and presenting all needed information to render a classic or modern
reference format:
1. Author Garvey, C; Bishop, B | Title "Paper Title" | Publication
"Journal Name" | Issue "Unambiguous Textual Description of
Volume/Issue/Page as needed to traverse dead trees" | Link
https://link.to/article | Date YYYY/MM/DD

NB:
* The general format is as a bar-delimited list of S-Expressions; that
is, the first word of each bar-delimited section is the metadata
type. Some types are standardised, such as "author", but any
metadata may be added and will be ignored if no way to
render/serialise it is known. This bar-delimited S-expression system
is human readable, unambiguous (bars are barely used ever outside
computers), and very machine parseable.
* Using hyperlinks means a standard format for fetching the document
for machines and browsers is covered, so the "Issue" element is for
people manually seeking the data, perhaps after the link has lapsed
or the publishing house has collapsed and they're on the phone to a
librarian or navigating a new website. It has no standard structure
because journals and publishers don't, either; some have volumes and
no issues, some have issues and no volumes, some are entirely online
and have neither.
* The Date should be YYYY/MM/DD in order for it to be standard,
unambiguous, and trivially orderable between many journal articles.
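
The ordering claim in the last point can be checked with a short Python sketch. The sample dates below are made up for illustration; the only thing being shown is that zero-padded YYYY/MM/DD strings sort chronologically under plain string comparison, with no date parsing needed.

```python
# Hypothetical reference dates in the YYYY/MM/DD form suggested above.
dates = ["2013/08/14", "2012/11/30", "2013/01/05"]

# The fields are zero-padded and run from most to least significant,
# so ordinary string sorting agrees with chronological order.
ordered = sorted(dates)
print(ordered)
```

This only holds while every date is written with a four-digit year and two-digit month and day; "2013/8/14" would break the ordering.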

To demonstrate how easy this is to parse, here are 7 lines (excluding
"#" comments) of trivial Python that do just that:

    def parse_ref(ref_line):
        output = {}
        # Cut off the list index by splitting at the first period mark,
        # keeping what follows it, and stripping whitespace from either end.
        ref_line = ref_line.split(".", 1)[1].strip()
        # Split metadata sections into blocks.
        for meta_section in ref_line.split("|"):
            # Strip whitespace again, then split at the first run of whitespace.
            meta_type, meta_data = meta_section.strip().split(None, 1)
            output[meta_type] = meta_data
        return output
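
As a rough, self-contained sketch of the parse-then-render idea, the block below parses one bar-delimited reference line and renders it as an HTML link carrying the remaining fields as `data-*` attributes. Lower-casing the keys, stripping the surrounding quotes, the `render_html` helper and the choice of `data-*` attributes are all assumptions made for illustration; none of them is part of the proposed format. A reference wrapped over several lines would also need joining into one string first.

```python
from html import escape


def parse_ref(ref_line):
    """Parse one numbered, bar-delimited reference line into a dict."""
    output = {}
    # Drop the list index ("1.") and keep the text after the first period.
    body = ref_line.split(".", 1)[1].strip()
    for meta_section in body.split("|"):
        # The first word is the metadata type, the rest is its value.
        meta_type, meta_data = meta_section.strip().split(None, 1)
        # Assumption: keys are case-insensitive and quotes are decoration.
        output[meta_type.lower()] = meta_data.strip().strip('"')
    return output


def render_html(ref):
    """Render a parsed reference as a link with data-* attributes."""
    # Assumption: every field except the link becomes a data-* attribute.
    attrs = "".join(
        ' data-{}="{}"'.format(key, escape(value, quote=True))
        for key, value in sorted(ref.items())
        if key != "link"
    )
    return '<a href="{}"{}>{}</a>'.format(
        escape(ref.get("link", "#"), quote=True),
        attrs,
        escape(ref.get("title", "Untitled")),
    )


example = ('1. Author Garvey, C; Bishop, B | Title "Paper Title" | '
           'Publication "Journal Name" | Link https://link.to/article | '
           'Date 2013/08/14')
ref = parse_ref(example)
print(ref["author"])
print(render_html(ref))
```

Unknown metadata types simply pass through as extra attributes here, which is one way to read "ignored if no way to render it is known"; a stricter renderer could drop them instead.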

Thoughts, again? :)
-Cathal
