
Sci Data. 2016; 3: 160050.
Published online 2016 Jul 5. https://doi.org/10.1038/sdata.2016.50
PMCID: PMC4932879
PMID: 27377622

A catalogue of 136 microbial draft genomes from Red Sea metagenomes


Abstract

Earth is expected to continue warming and the Red Sea is a model environment for understanding the effects of global warming on ocean microbiomes due to its unusually high temperature, salinity and solar irradiance. However, most microbial diversity analyses of the Red Sea have been limited to cultured representatives and single marker gene analyses, hence neglecting the substantial uncultured majority. Here, we report 136 microbial genomes (completion minus contamination ≥50%) assembled from 45 metagenomes from eight stations spanning the Red Sea and taken from multiple depths between 10 and 500 m. Phylogenomic analysis showed that most of the retrieved genomes belong to seven different phyla of known marine microbes, but more than half represent currently uncultured species. The open-access data presented here comprise the largest number of Red Sea representative microbial genomes reported in a single study and will help facilitate future studies of the physiology of these microorganisms and of how they have adapted to the relatively harsh conditions of the Red Sea.

Subject terms: Water microbiology, Marine biology, Genome informatics

Background & Summary

The Red Sea is an ideal marine environment to study microbial adaptation to physical conditions atypical of global oceans: high temperature, high salinity, and high irradiance. In late summer 2011, we undertook the King Abdullah University of Science and Technology (KAUST) Red Sea Expedition (KRSE2011) in the eastern Red Sea in order to map its diversity along environmental gradients that occur with changes in latitude, longitude, and depth1. This time of year is not only when temperatures and evaporation (and hence salinity) are highest, but also when a foreign water mass called the Gulf of Aden Intermediate Water (GAIW) intrudes into the Red Sea1,2 (Fig. 1). The GAIW brings nutrient-rich water to the Red Sea, providing nitrogen, phosphorus, and other elements to this otherwise oligotrophic sea, and is likely to introduce important microbial diversity.

Figure 1: Experimental workflow for this study.

The circles superimposed on the 3D map of the Red Sea show the sampling points of the King Abdullah University of Science and Technology Red Sea Expedition 2011. The green lines represent the three Gulf of Aden Intermediate Water (GAIW) sampling points. The number within each circle is the number of genomes recovered from that sample. Colors represent water temperature from high (dark red) to low (dark blue). A total of 45 samples of 20 l each were collected and passed through a series of filters. For this study, DNA extraction was performed on the small microbial fractions (0.1 to 1.2 μm). Extracted DNA was sequenced on the Illumina HiSeq 2000, generating paired-end reads (2×93 bp). Reads from each metagenome were cleaned and assembled individually. Genomes were binned using tetranucleotide frequency and coverage-based methods, then refined and quality checked. All 136 genomes were annotated by IMG/ER and taxonomically assigned based on genome trees inferred from single-copy genes.
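As an illustration of the tetranucleotide signal mentioned in the workflow, the following is a minimal Python sketch of how a tetranucleotide frequency profile can be computed for one scaffold. It is not the binning software used in this study; the function name and the toy sequence are hypothetical, and the sketch counts only the forward strand, whereas binning tools commonly also merge reverse-complement tetramers.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(sequence):
    """Return the relative frequency of each of the 256 tetranucleotides."""
    sequence = sequence.upper()
    all_tetramers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter()
    # Slide a 4-base window along the sequence, one base at a time.
    for i in range(len(sequence) - 3):
        tetramer = sequence[i:i + 4]
        # Skip windows containing N or other ambiguity codes.
        if set(tetramer) <= set("ACGT"):
            counts[tetramer] += 1
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in all_tetramers}


# Hypothetical toy scaffold, not taken from the dataset.
profile = tetranucleotide_frequencies("ACGTACGTAC")
```

Scaffolds that originate from the same genome tend to have similar profiles, which is why such vectors, combined with read coverage, can be used to group scaffolds into bins.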

Insights into the taxonomic, evolutionary, and functional diversity of the Red Sea have largely been based on studies of pure cultures3–5 and single marker genes such as the 16S rRNA6,7 or the internal transcribed spacer8. Recently, investigations of microbial ecology have shifted towards whole-genome, culture-independent methods, notably single-cell genomics and metagenomics9,10. Single-cell genomics recovers complete and partial genomes of single cells from complex environments, but it requires specialised equipment, is costly, and has relatively low throughput11–13. Metagenomics harnesses recent advances in sequencing technology and bioinformatics to recover genomes of individual populations or of groups of closely related organisms14–16. Application of these methods has resulted in the recovery of numerous genomes of uncultivated microorganisms that have provided surprising insights into the diversity and function of microbial communities10,14,17–19.

During the KRSE2011, eight stations were sampled along a cruise track from south to north, capturing gradients in temperature, salinity, oxygen, and nutrients, including the unique GAIW water mass (Fig. 1 and Table 1 (available online only)). At each station, samples were collected from the surface to mesopelagic depths (10, 25, 50, 100, 200, and 500 m), except for stations 12 and 34, which were shallower than 500 m (Fig. 1 and Table 1 (available online only)), in order to capture a greater variation in environmental parameters and microbial diversity. Here, we reconstructed 136 genomes from 45 individually assembled metagenomes (Figs 1 and 2, Tables 1 and 2 (available online only), Data Citation 1) using differential read coverage and tetranucleotide frequency methods. Of these, 43 were ‘near-complete’, with an estimated completion minus contamination of ≥90%, while the other 93 draft genomes had completion minus contamination of ≥50% (Table 2 (available online only)). To our knowledge, this is the largest number of microbial genomes from the Red Sea to be reported in a single study.
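The two quality tiers described above can be expressed as a short Python sketch. The function name and tier labels are illustrative assumptions; the thresholds (≥90% and ≥50% completion minus contamination) are those stated in the text, and the example values are the completeness and contamination figures listed in Table 2 for the named bins.

```python
def quality_tier(completeness, contamination):
    """Classify a genome bin by completeness minus contamination (both in %)."""
    # Round to two decimals so that values such as 94.13 - 4.13 compare as 90.0
    # despite floating-point representation error.
    score = round(completeness - contamination, 2)
    if score >= 90:
        return "near-complete"
    if score >= 50:
        return "draft"
    return "below threshold"


# Values taken from Table 2.
erythrobacter_s22_b4 = quality_tier(99.93, 0.43)    # score 99.5
acidimicrobiia_s09_b7 = quality_tier(84.25, 2.23)   # score 82.02
```

Subtracting contamination penalises bins that look complete only because they contain marker genes from more than one organism.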

Figure 2: Phylogenetic trees for the archaeal (green lines; top left) and bacterial (blue lines; bottom right) domains based on 122 and 120 single-copy marker genes, respectively.

The clades represented by the triangles are collapsed at the phylum (P) level except for phyla containing genomes from this study which are expanded at the class (C) level and highlighted in red. Certain phyla have genome representatives only at the phylum level (Thaumarchaeota, Marinimicrobia, Cyanobacteria, and Bdellovibrionaeota). Numbers in parentheses indicate the count of recovered genomes from a particular taxonomic level. Dashed lines indicate nodes for class level. Robustness of the tree is indicated by black circles (size of circles scaled from 80 to 100% bootstrap support values). Trees were inferred independently. The archaeal tree was rooted with the DPANN superphylum9 while the bacterial tree was ‘arbitrarily’ rooted with the phylum Chloroflexi42 but should be treated as unrooted.

Table 1

Characteristics of the 45 Red Sea metagenomic samples
Isolation source | Water mass | Date and time | Assembly size (Mbp) | No. of scaffolds | Largest scaffold size (Mbp) | N50 | Depth (m) | BioProject | BioSample | NCBI accession (assembled) | NCBI SRA accession (raw reads)
Red Sea water column Station 12 | Red Sea | 18/09/2011 10:26 | 44.11 | 29555 | 0.120 | 2072 | 10 | PRJNA289734 | SAMN03860258 | LUMR00000000 | SRR2102994
Red Sea water column Station 12 | Red Sea | | 75.83 | 46146 | 0.102 | 2508 | 25 | PRJNA289734 | SAMN03860259 | LUMQ00000000 | SRR2102995
Red Sea water column Station 12 | GAIW | | 37.84 | 23158 | 0.157 | 2720 | 47 | PRJNA289734 | SAMN03860260 | LUMP00000000 | SRR2103006
Red Sea water column Station 22 | Red Sea | 19/09/2011 08:53 | 47.49 | 38508 | 0.060 | 1415 | 10 | PRJNA289734 | SAMN03860261 | LUMO00000000 | SRR2103017
Red Sea water column Station 22 | Red Sea | | 40.79 | 27718 | 0.072 | 1973 | 25 | PRJNA289734 | SAMN03860262 | LUMN00000000 | SRR2103028
Red Sea water column Station 22 | Red Sea | | 41.68 | 29846 | 0.090 | 1775 | 50 | PRJNA289734 | SAMN03860263 | LUMM00000000 | SRR2103034
Red Sea water column Station 22 | Red Sea | | 41.80 | 33190 | 0.129 | 1457 | 100 | PRJNA289734 | SAMN03860264 | LUML00000000 | SRR2103035
Red Sea water column Station 22 | Red Sea | | 47.47 | 33409 | 0.133 | 1876 | 200 | PRJNA289734 | SAMN03860265 | LUMK00000000 | SRR2103036
Red Sea water column Station 22 | Red Sea | | 80.71 | 45186 | 0.179 | 3201 | 500 | PRJNA289734 | SAMN03860266 | LUMJ00000000 | SRR2103037
Red Sea water column Station 34 | Red Sea | 20/09/2011 04:10 | 77.82 | 46638 | 0.221 | 2646 | 10 | PRJNA289734 | SAMN03860267 | LUMI00000000 | SRR2103038
Red Sea water column Station 34 | Red Sea | | 42.51 | 28685 | 0.221 | 2018 | 25 | PRJNA289734 | SAMN03860268 | LUMH00000000 | SRR2102996
Red Sea water column Station 34 | GAIW | | 38.78 | 27506 | 0.253 | 1843 | 50 | PRJNA289734 | SAMN03860269 | LUMG00000000 | SRR2102997
Red Sea water column Station 34 | GAIW | | 39.04 | 35615 | 0.037 | 1167 | 100 | PRJNA289734 | SAMN03860270 | LUMF00000000 | SRR2102998
Red Sea water column Station 34 | Red Sea | | 63.41 | 42929 | 0.125 | 1962 | 200 | PRJNA289734 | SAMN03860271 | LUME00000000 | SRR2102999
Red Sea water column Station 34 | Red Sea | | 64.65 | 37057 | 0.263 | 2832 | 258 | PRJNA289734 | SAMN03860272 | LUMD00000000 | SRR2103000
Red Sea water column Station 91 | Red Sea | 24/09/2011 20:54 | 35.42 | 26110 | 0.116 | 1716 | 10 | PRJNA289734 | SAMN03860273 | LUMC00000000 | SRR2103001
Red Sea water column Station 91 | Red Sea | | 28.80 | 21238 | 0.079 | 1783 | 25 | PRJNA289734 | SAMN03860274 | LUMB00000000 | SRR2103002
Red Sea water column Station 91 | Red Sea | | 21.35 | 18107 | 0.070 | 1293 | 50 | PRJNA289734 | SAMN03860275 | LUMA00000000 | SRR2103003
Red Sea water column Station 91 | Red Sea | | 51.35 | 45910 | 0.141 | 1194 | 100 | PRJNA289734 | SAMN03860276 | LULZ00000000 | SRR2103004
Red Sea water column Station 91 | Red Sea | | 49.04 | 43484 | 0.129 | 1239 | 200 | PRJNA289734 | SAMN03860277 | LULY00000000 | SRR2103005
Red Sea water column Station 91 | Red Sea | | 68.61 | 45543 | 0.194 | 2011 | 500 | PRJNA289734 | SAMN03860278 | LULX00000000 | SRR2103007
Red Sea water column Station 108 | Red Sea | 27/09/2011 21:03 | 68.31 | 39235 | 1.199 | 2885 | 10 | PRJNA289734 | SAMN03860279 | LULW00000000 | SRR2103008
Red Sea water column Station 108 | Red Sea | | 59.60 | 40690 | 0.160 | 1948 | 25 | PRJNA289734 | SAMN03860280 | LULV00000000 | SRR2103009
Red Sea water column Station 108 | Red Sea | | 58.23 | 49013 | 0.058 | 1334 | 50 | PRJNA289734 | SAMN03860281 | LULU00000000 | SRR2103010
Red Sea water column Station 108 | Red Sea | | 36.51 | 27461 | 0.142 | 1626 | 100 | PRJNA289734 | SAMN03860282 | LULT00000000 | SRR2103011
Red Sea water column Station 108 | Red Sea | | 58.51 | 45536 | 0.079 | 1553 | 200 | PRJNA289734 | SAMN03860283 | LULS00000000 | SRR2103012
Red Sea water column Station 108 | Red Sea | | 63.24 | 41604 | 0.139 | 2136 | 500 | PRJNA289734 | SAMN03860284 | LULR00000000 | SRR2103013
Red Sea water column Station 149 | Red Sea | 01/10/2011 05:00 | 56.10 | 31577 | 1.198 | 2984 | 10 | PRJNA289734 | SAMN03860285 | LULQ00000000 | SRR2103014
Red Sea water column Station 149 | Red Sea | | 62.95 | 34995 | 0.416 | 3178 | 25 | PRJNA289734 | SAMN03860286 | LULP00000000 | SRR2103015
Red Sea water column Station 149 | Red Sea | | 79.82 | 47255 | 0.314 | 2763 | 50 | PRJNA289734 | SAMN03860287 | LULO00000000 | SRR2103016
Red Sea water column Station 149 | Red Sea | | 38.33 | 25062 | 0.170 | 2222 | 100 | PRJNA289734 | SAMN03860288 | LULN00000000 | SRR2103018
Red Sea water column Station 149 | Red Sea | | 66.80 | 52078 | 0.105 | 1503 | 200 | PRJNA289734 | SAMN03860289 | LULM00000000 | SRR2103019
Red Sea water column Station 149 | Red Sea | | 85.38 | 54665 | 0.289 | 2146 | 500 | PRJNA289734 | SAMN03860290 | LULL00000000 | SRR2103020
Red Sea water column Station 169 | Red Sea | 03/10/2011 04:55 | 99.91 | 54636 | 1.199 | 3019 | 10 | PRJNA289734 | SAMN03860291 | LULK00000000 | SRR2103021
Red Sea water column Station 169 | Red Sea | | 83.57 | 45217 | 0.308 | 3232 | 25 | PRJNA289734 | SAMN03860292 | LULJ00000000 | SRR2103022
Red Sea water column Station 169 | Red Sea | | 84.94 | 53578 | 0.158 | 2309 | 50 | PRJNA289734 | SAMN03860293 | LULI00000000 | SRR2103023
Red Sea water column Station 169 | Red Sea | | 73.95 | 54373 | 0.149 | 1685 | 100 | PRJNA289734 | SAMN03860294 | LULH00000000 | SRR2103024
Red Sea water column Station 169 | Red Sea | | 73.59 | 57258 | 0.155 | 1530 | 200 | PRJNA289734 | SAMN03860295 | LULG00000000 | SRR2103025
Red Sea water column Station 169 | Red Sea | | 79.59 | 58308 | 0.400 | 1655 | 500 | PRJNA289734 | SAMN03860296 | LULF00000000 | SRR2103026
Red Sea water column Station 192 | Red Sea | 05/10/2011 10:56 | 98.63 | 57007 | 1.199 | 2789 | 10 | PRJNA289734 | SAMN03860297 | LULE00000000 | SRR2103027
Red Sea water column Station 192 | Red Sea | | 58.15 | 34483 | 0.321 | 2666 | 25 | PRJNA289734 | SAMN03860298 | LUMS00000000 | SRR2103029
Red Sea water column Station 192 | Red Sea | | 87.61 | 47563 | 1.358 | 3315 | 50 | PRJNA289734 | SAMN03860299 | LUMT00000000 | SRR2103030
Red Sea water column Station 192 | Red Sea | | 50.63 | 34015 | 0.680 | 2014 | 100 | PRJNA289734 | SAMN03860300 | LUMU00000000 | SRR2103031
Red Sea water column Station 192 | Red Sea | | 45.78 | 30314 | 0.295 | 2337 | 200 | PRJNA289734 | SAMN03860301 | LUMV00000000 | SRR2103032
Red Sea water column Station 192 | Red Sea | | 73.78 | 44069 | 0.278 | 2528 | 500 | PRJNA289734 | SAMN03860302 | LUMW00000000 | SRR2103033
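
Table 1 reports an N50 for each assembly. The text does not spell out how it was computed, so the Python sketch below assumes the conventional definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name and the toy lengths are hypothetical and are not drawn from the dataset.

```python
def n50(scaffold_lengths):
    """Return the N50 of a collection of scaffold lengths."""
    lengths = sorted(scaffold_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    # Accumulate scaffolds from longest to shortest until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("scaffold_lengths must not be empty")


# Hypothetical toy assembly: total length 290, so half is 145.
# 80 + 70 = 150 reaches the halfway point, so the N50 is 70.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

A higher N50 indicates a more contiguous assembly, although it says nothing about whether the scaffolds are correct.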

Table 2

Characteristics of the 136 genomes reported in this study
Genome bins | Genome size (Mbp) | No. of scaffolds | IMG gene count | GC (%) | Marker lineage for CheckM | Completeness (%) | Contamination (%) | Completeness minus contamination (%) | Isolation source | Depth (m) | Latitude/Longitude | BioProject | BioSample | NCBI accession | IMG genome ID
Acidimicrobiia bacterium REDSEA-S09_B72.02226188471.62k__Bacteria (UID1453)84.252.2382.02Red Sea water column Station 2250017.996 N 39.799 EPRJNA289734SAMN04534547LUMX000000002651870138
Acidimicrobiia bacterium REDSEA-S14_B42.10257165471.6k__Bacteria (UID1453)85.862.5683.3Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534548LUMY000000002651870139
Acidimicrobiia bacterium REDSEA-S20_B61.76338206471.41k__Bacteria (UID1453)81.361.5979.77Red Sea water column Station 9120020.525 N 38.781 EPRJNA289734SAMN04534549LUMZ000000002651870140
Acidimicrobiia bacterium REDSEA-S21_B101.73328203771.37k__Bacteria (UID1453)64.310.4363.88Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534550LUNA000000002651870141
Acidimicrobiia bacterium REDSEA-S33_B8N92.15262238071.46k__Bacteria (UID1453)80.485.5674.92Red Sea water column Station 14950023.604 N 37.054 EPRJNA289734SAMN04534551LUNB000000002651870142
Acinetobacter sp. REDSEA-S21_B142.58517308539.1f__Moraxellaceae (UID4680)71.544.3267.22Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534552LUNC000000002651870143
Actinobacteria bacterium REDSEA-S36_B121.37255173462.57o__Actinomycetales (UID1663)60.63060.63Red Sea water column Station 1695025.772 N 36.116 EPRJNA289734SAMN04534553LUND000000002651870144
Aeromicrobium sp. REDSEA-S32_B73.52333379471.86o__Actinomycetales (UID1697)95.736.2389.5Red Sea water column Station 14920023.604 N 37.054 EPRJNA289734SAMN04534555LUNF000000002651870146
Aeromicrobium sp. REDSEA-S35_B13.5683360972.12o__Actinomycetales (UID1697)98.062.895.26Red Sea water column Station 1692525.772 N 36.116 EPRJNA289734SAMN04534556LUNG000000002651870147
Aeromicrobium sp. REDSEA-S38_B23.49104360872.12o__Actinomycetales (UID1697)98.531.996.63Red Sea water column Station 16920025.772 N 36.116 EPRJNA289734SAMN04534557LUNH000000002651870148
Aeromicrobium sp. REDSEA-S42_B43.4754353872.15o__Actinomycetales (UID1697)98.450.8697.59Red Sea water column Station 1925027.897 N 34.507 EPRJNA289734SAMN04534558LUNI000000002651870149
Aeromicrobium sp. REDSEA-S44_B13.4744356072.09o__Actinomycetales (UID1697)98.910.9597.96Red Sea water column Station 19220027.897 N 34.507 EPRJNA289734SAMN04534559LUNJ000000002651870150
Alteromonas macleodii str. REDSEA-S09_B24.34102319044.52c__Gammaproteobacteria (UID4761)98.830.5198.32Red Sea water column Station 2250017.996 N 39.799 EPRJNA289734SAMN04534560LUNK000000002651870151
Alteromonas macleodii str. REDSEA-S10_B92.55490123944.64c__Gammaproteobacteria (UID4761)59.710.5759.14Red Sea water column Station 341018.58 N 40.743 EPRJNA289734SAMN04534561LUNL000000002651870152
Alteromonas macleodii str. REDSEA-S12_B52.64514235543.98c__Gammaproteobacteria (UID4761)59.422.3957.03Red Sea water column Station 345018.58 N 40.743 EPRJNA289734SAMN04534562LUNM000000002651870153
Alteromonas macleodii str. REDSEA-S14_B113.11543213544.58c__Gammaproteobacteria (UID4761)73.751.2472.51Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534563LUNN000000002651870154
Alteromonas macleodii str. REDSEA-S15_B113.79457322744.59c__Gammaproteobacteria (UID4761)89.91.2588.65Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534564LUNO000000002651870155
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S14_B61.43234285554.35k__Bacteria (UID2495)72.350.1872.17Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534578LUOC000000002651870223
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S15_B101.65233404454.08k__Bacteria (UID2495)71.030.170.93Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534579LUOD000000002651870224
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S15_B131.48201347841.84k__Bacteria (UID2495)71.593.368.29Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534580LUOE000000002651870225
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S27_B1N121.36280164351.44k__Bacteria (UID2495)60.192.257.99Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534581LUOF000000002651870226
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S33_B131.27264146254.33k__Bacteria (UID2495)56.031.6554.38Red Sea water column Station 14950023.604 N 37.054 EPRJNA289734SAMN04534582LUOG000000002651870227
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S38_B131.15217131341.56k__Bacteria (UID2495)58.341.257.14Red Sea water column Station 16920025.772 N 36.116 EPRJNA289734SAMN04534583LUOH000000002651870228
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S39_B111.09225120354.56k__Bacteria (UID2495)52.970.152.87Red Sea water column Station 16950025.772 N 36.116 EPRJNA289734SAMN04534584LUOI000000002651870229
Candidatus Marinimicrobia (SAR406 cluster) bacterium REDSEA-S39_B70.241522938.13k__Bacteria (UID2495)78.56.8971.61Red Sea water column Station 16950025.772 N 36.116 EPRJNA289734SAMN04534585LUOJ000000002651870230
Candidatus Thioglobus (SUP05 cluster) sp. REDSEA-S03_B11.5897226438.39p__Proteobacteria (UID3880)87.923.8684.06Red Sea water column Station 124717.662 N 40.905 EPRJNA289734SAMN04534667LURM000000002651870213
Candidatus Thioglobus (SUP05 cluster) sp. REDSEA-S12_B11.6177154038.35p__Proteobacteria (UID3880)88.911.3287.59Red Sea water column Station 345018.58 N 40.743 EPRJNA289734SAMN04534668LURN000000002651870214
Candidatus Thioglobus (SUP05 cluster) sp. REDSEA-S14_B121.83262233639.85p__Proteobacteria (UID3880)67.297.8659.43Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534669LURO000000002651870215
Erythrobacter sp. REDSEA-S22_B42.6910289763.73o__Sphingomonadales (UID3310)99.930.4399.5Red Sea water column Station 1081022.046 N 37.929 EPRJNA289734SAMN04534565LUNP000000002651870156
Erythrobacter sp. REDSEA-S28_B22.698289463.71o__Sphingomonadales (UID3310)99.90.4399.47Red Sea water column Station 1491023.604 N 37.054 EPRJNA289734SAMN04534566LUNQ000000002654587888
Erythrobacter sp. REDSEA-S34_B32.697289363.73o__Sphingomonadales (UID3310)99.930.4399.5Red Sea water column Station 1691025.772 N 36.116 EPRJNA289734SAMN04534567LUNR000000002654587889
Erythrobacter sp. REDSEA-S36_B62.9399303563.57o__Sphingomonadales (UID3310)97.092.5394.56Red Sea water column Station 1695025.772 N 36.116 EPRJNA289734SAMN04534568LUNS000000002654587890
Erythrobacter sp. REDSEA-S37_B32.7994296863.54o__Sphingomonadales (UID3310)97.59196.59Red Sea water column Station 16910025.772 N 36.116 EPRJNA289734SAMN04534569LUNT000000002654587891
Erythrobacter sp. REDSEA-S40_B12.697289763.72o__Sphingomonadales (UID3310)99.930.4399.5Red Sea water column Station 1921027.897 N 34.507 EPRJNA289734SAMN04534570LUNU000000002654587892
Erythrobacter sp. REDSEA-S41_B12.9323293363.59o__Sphingomonadales (UID3310)99.540.6398.91Red Sea water column Station 1922527.897 N 34.507 EPRJNA289734SAMN04534571LUNV000000002654587893
Erythrobacter sp. REDSEA-S42_B52.6812290563.72o__Sphingomonadales (UID3310)99.830.4399.4Red Sea water column Station 1925027.897 N 34.507 EPRJNA289734SAMN04534572LUNW000000002654587894
Erythrobacter sp. REDSEA-S43_B22.8718291863.64o__Sphingomonadales (UID3310)99.840.9498.9Red Sea water column Station 19210027.897 N 34.507 EPRJNA289734SAMN04534573LUNX000000002651870157
Erythrobacter sp. REDSEA-S45_B72.89285313963.54o__Sphingomonadales (UID3310)89.241.2587.99Red Sea water column Station 19250027.897 N 34.507 EPRJNA289734SAMN04534574LUNY000000002651870158
Idiomarina sp. REDSEA-S21_B42.2660245147.2c__Gammaproteobacteria (UID4761)95.290.4294.87Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534575LUNZ000000002651870159
Idiomarina sp. REDSEA-S27_B42.3857255047.27c__Gammaproteobacteria (UID4761)98.140.7497.4Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534576LUOA000000002651870160
Marine group II euryarchaeote REDSEA-S03_B61.32210140045.47p__Euryarchaeota (UID3)65.012.362.71Red Sea water column Station 124717.662 N 40.905 EPRJNA289734SAMN04534670LURP000000002651870292
Marine group II euryarchaeote REDSEA-S10_B21.2329369850.45p__Euryarchaeota (UID3)76.131.274.93Red Sea water column Station 341018.58 N 40.743 EPRJNA289734SAMN04534671LURQ000000002651870293
Marine group II euryarchaeote REDSEA-S11_B3N41.3035219750.98p__Euryarchaeota (UID3)81.96081.96Red Sea water column Station 342518.58 N 40.743 EPRJNA289734SAMN04534672LURR000000002651870294
Marine group II euryarchaeote REDSEA-S19_B7N81.17155179652.04p__Euryarchaeota (UID3)70.40.869.6Red Sea water column Station 9110020.525 N 38.781 EPRJNA289734SAMN04534673LURS000000002651870295
Marine group II euryarchaeote REDSEA-S25_B4N51.10167127252.14p__Euryarchaeota (UID3)58.430.2758.16Red Sea water column Station 10810022.046 N 37.929 EPRJNA289734SAMN04534674LURT000000002651870296
Marine group II euryarchaeote REDSEA-S29_B8N91.10131124450.19p__Euryarchaeota (UID3)59.47059.47Red Sea water column Station 1492523.604 N 37.054 EPRJNA289734SAMN04534675LURU000000002651870297
Marine group II euryarchaeote REDSEA-S30_B121.27137137436.9p__Euryarchaeota (UID3)67.790.866.99Red Sea water column Station 1495023.604 N 37.054 EPRJNA289734SAMN04534676LURV000000002651870298
Marine group II euryarchaeote REDSEA-S37_B2N91.21615549.08p__Euryarchaeota (UID3)76.470.0676.41Red Sea water column Station 16910025.772 N 36.116 EPRJNA289734SAMN04534677LURW000000002651870299
Marine group II euryarchaeote REDSEA-S40_B11N131.24118130050.09p__Euryarchaeota (UID3)71.96071.96Red Sea water column Station 1921027.897 N 34.507 EPRJNA289734SAMN04534678LURX000000002651870300
Marine group II euryarchaeote REDSEA-S41_B61.13122121649.97p__Euryarchaeota (UID3)71.351.9269.43Red Sea water column Station 1922527.897 N 34.507 EPRJNA289734SAMN04534679LURY000000002651870301
Marine group II euryarchaeote REDSEA-S42_B71.1943120249.72p__Euryarchaeota (UID3)75.73075.73Red Sea water column Station 1925027.897 N 34.507 EPRJNA289734SAMN04534680LURZ000000002651870302
Marine group II euryarchaeote REDSEA-S43_B81.10105121350.28p__Euryarchaeota (UID3)72.13072.13Red Sea water column Station 19210027.897 N 34.507 EPRJNA289734SAMN04534681LUSA000000002651870303
Marinobacter sp. REDSEA-S15_B162.89527287357.64c__Gammaproteobacteria (UID4444)72.810.8971.92Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534586LUOK000000002651870218
Marinobacter sp. REDSEA-S21_B2N34.37262434357.01c__Gammaproteobacteria (UID4444)88.752.5986.16Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534587LUOL000000002651870219
Marinobacter sp. REDSEA-S27_B103.27500378157.24k__Bacteria (UID203)70070Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534588LUOM000000002651870220
Maritimibacter sp. REDSEA-S28_B54.25122463264.32f__Rhodobacteraceae (UID3356)97.930.6897.25Red Sea water column Station 1491023.604 N 37.054 EPRJNA289734SAMN04534590LUOO000000002651870216
Maritimibacter sp. REDSEA-S40_B33.8728451164.35f__Rhodobacteraceae (UID3356)99.70.6899.02Red Sea water column Station 1921027.897 N 34.507 EPRJNA289734SAMN04534591LUOP000000002651870217
Moraxellaceae bacterium REDSEA-S29_B62.23154232741.94c__Gammaproteobacteria (UID4201)91.59091.59Red Sea water column Station 1492523.604 N 37.054 EPRJNA289734SAMN04534592LUOQ000000002651870134
Moraxellaceae bacterium REDSEA-S32_B12.3880227242.07c__Gammaproteobacteria (UID4201)98.13098.13Red Sea water column Station 14920023.604 N 37.054 EPRJNA289734SAMN04534593LUOR000000002654587887
Moraxellaceae bacterium REDSEA-S35_B91.60306190041.91c__Gammaproteobacteria (UID4201)71.10.7770.33Red Sea water column Station 1692525.772 N 36.116 EPRJNA289734SAMN04534594LUOS000000002651870137
Moraxellaceae bacterium REDSEA-S38_B32.41100229341.98c__Gammaproteobacteria (UID4201)97.33097.33Red Sea water column Station 16920025.772 N 36.116 EPRJNA289734SAMN04534595LUOT000000002651870132
Moraxellaceae bacterium REDSEA-S42_B151.88270212441.99c__Gammaproteobacteria (UID4201)78.540.5777.97Red Sea water column Station 1925027.897 N 34.507 EPRJNA289734SAMN04534596LUOU000000002651870135
Moraxellaceae bacterium REDSEA-S44_B22.3595232442.04c__Gammaproteobacteria (UID4201)97.131.1595.98Red Sea water column Station 19220027.897 N 34.507 EPRJNA289734SAMN04534597LUOV000000002651870133
Moraxellaceae bacterium REDSEA-S45_B111.71287193842.04c__Gammaproteobacteria (UID4201)78.092.0476.05Red Sea water column Station 19250027.897 N 34.507 EPRJNA289734SAMN04534598LUOW000000002651870136
Nitrosopelagicus sp. REDSEA-S08_B10.5892182335.43k__Archaea (UID2)58.95.8353.07Red Sea water column Station 2220017.996 N 39.799 EPRJNA289734SAMN04534599LUOX000000002651870205
Nitrosopelagicus sp. REDSEA-S19_B12N31.51307122636.28k__Archaea (UID2)85.745.3480.4Red Sea water column Station 9110020.525 N 38.781 EPRJNA289734SAMN04534600LUOY000000002651870206
Nitrosopelagicus sp. REDSEA-S25_B30.89116129434.1k__Archaea (UID2)74.761.9472.82Red Sea water column Station 10810022.046 N 37.929 EPRJNA289734SAMN04534601LUOZ000000002651870207
Nitrosopelagicus sp. REDSEA-S27_B13N21.51273200937.26k__Archaea (UID2)69.844.8564.99Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534602LUPA000000002651870208
Nitrosopelagicus sp. REDSEA-S31_B21.0186138734.05k__Archaea (UID2)92.642.0290.62Red Sea water column Station 14910023.604 N 37.054 EPRJNA289734SAMN04534603LUPB000000002651870209
Nitrosopelagicus sp. REDSEA-S32_B20.455560433.9k__Archaea (UID2)53.41.9451.46Red Sea water column Station 14920023.604 N 37.054 EPRJNA289734SAMN04534604LUPC000000002651870210
Nitrosopelagicus sp. REDSEA-S37_B60.87104123334.05k__Archaea (UID2)89.811.9487.87Red Sea water column Station 16910025.772 N 36.116 EPRJNA289734SAMN04534605LUPD000000002651870211
Nitrosopelagicus sp. REDSEA-S43_B10.98115140333.94k__Archaea (UID2)84.954.5380.42Red Sea water column Station 19210027.897 N 34.507 EPRJNA289734SAMN04534606LUPE000000002651870212
Nocardioides sp. REDSEA-S22_B23.7694405071.84o__Actinomycetales (UID1697)94.431.3893.05Red Sea water column Station 1081022.046 N 37.929 EPRJNA289734SAMN04534607LUPF000000002651870231
Nocardioides sp. REDSEA-S25_B92.25485264371.44o__Actinomycetales (UID1697)57.781.1756.61Red Sea water column Station 10810022.046 N 37.929 EPRJNA289734SAMN04534608LUPG000000002651870232
Nocardioides sp. REDSEA-S28_B43.90313419471.71o__Actinomycetales (UID1697)94.913.5491.37Red Sea water column Station 1491023.604 N 37.054 EPRJNA289734SAMN04534609LUPH000000002651870233
Nocardioides sp. REDSEA-S30_B43.68104385771.94o__Actinomycetales (UID1697)96.890.8696.03Red Sea water column Station 1495023.604 N 37.054 EPRJNA289734SAMN04534610LUPI000000002651870234
Nocardioides sp. REDSEA-S31_B43.44387384171.68o__Actinomycetales (UID1697)83.541.2182.33Red Sea water column Station 14910023.604 N 37.054 EPRJNA289734SAMN04534611LUPJ000000002651870235
Nocardioides sp. REDSEA-S33_B33.5661379672.09o__Actinomycetales (UID1697)98.19098.19Red Sea water column Station 14950023.604 N 37.054 EPRJNA289734SAMN04534612LUPK000000002651870236
Nocardioides sp. REDSEA-S34_B54.17291442771.75o__Actinomycetales (UID1697)94.134.1390Red Sea water column Station 1691025.772 N 36.116 EPRJNA289734SAMN04534613LUPL000000002651870237
Nocardioides sp. REDSEA-S36_B102.34439274471.08o__Actinomycetales (UID1697)59.331.2658.07Red Sea water column Station 1695025.772 N 36.116 EPRJNA289734SAMN04534614LUPM000000002651870238
Nocardioides sp. REDSEA-S37_B122.85503327871.42o__Actinomycetales (UID1697)77.850.677.25Red Sea water column Station 16910025.772 N 36.116 EPRJNA289734SAMN04534615LUPN000000002651870239
Nocardioides sp. REDSEA-S39_B23.4856379672.16o__Actinomycetales (UID1697)97.670.0597.62Red Sea water column Station 16950025.772 N 36.116 EPRJNA289734SAMN04534616LUPO000000002651870240
Nocardioides sp. REDSEA-S40_B43.70116396771.95o__Actinomycetales (UID1697)91.191.1190.08Red Sea water column Station 1921027.897 N 34.507 EPRJNA289734SAMN04534617LUPP000000002651870241
Nocardioides sp. REDSEA-S43_B33.68164394871.9o__Actinomycetales (UID1697)96.070.3595.72Red Sea water column Station 19210027.897 N 34.507 EPRJNA289734SAMN04534618LUPQ000000002651870242
Prochlorococcus sp. REDSEA-S17_B11.07152204130.99p__Cyanobacteria (UID2143)63.124.6558.47Red Sea water column Station 912520.525 N 38.781 EPRJNA289734SAMN04534620LUPS000000002651870162
Prochlorococcus sp. REDSEA-S22_B11.01113133831.11p__Cyanobacteria (UID2143)60.427.3853.04Red Sea water column Station 1081022.046 N 37.929 EPRJNA289734SAMN04534621LUPT000000002651870163
Prochlorococcus sp. REDSEA-S23_B11.06123138130.87p__Cyanobacteria (UID2143)61.617.254.41Red Sea water column Station 1082522.046 N 37.929 EPRJNA289734SAMN04534622LUPU000000002651870164
Prochlorococcus sp. REDSEA-S28_B10.93101118131.35p__Cyanobacteria (UID2143)55.194.7150.48Red Sea water column Station 1491023.604 N 37.054 EPRJNA289734SAMN04534623LUPV000000002651870165
Rhodobacteraceae bacterium REDSEA-S02_B32.07113219539.74f__Rhodobacteraceae (UID3340)80.461.0479.42Red Sea water column Station 122517.662 N 40.905 EPRJNA289734SAMN04534625LUPX000000002651870273
Rhodobacteraceae bacterium REDSEA-S03_B42.03202138039.69f__Rhodobacteraceae (UID3340)77.84.8672.94Red Sea water column Station 124717.662 N 40.905 EPRJNA289734SAMN04534626LUPY000000002651870274
Rhodobacteraceae bacterium REDSEA-S11_B61.89192290039.63k__Bacteria (UID203)77.438.6268.81Red Sea water column Station 342518.58 N 40.743 EPRJNA289734SAMN04534628LUQA000000002651870277
Rhodobacteraceae bacterium REDSEA-S29_B101.79357226240.1f__Rhodobacteraceae (UID3340)57.513.9953.52Red Sea water column Station 1492523.604 N 37.054 EPRJNA289734SAMN04534629LUQB000000002651870278
Rhodobacteraceae bacterium REDSEA-S34_B62.41111260640.44f__Rhodobacteraceae (UID3340)89.511.5787.94Red Sea water column Station 1691025.772 N 36.116 EPRJNA289734SAMN04534630LUQC000000002651870280
SAR116 cluster alphaproteobacterium REDSEA-S02_B121.51247221563.03c__Alphaproteobacteria (UID3305)74.630.4474.19Red Sea water column Station 122517.662 N 40.905 EPRJNA289734SAMN04534631LUQD000000002654587886
SAR116 cluster alphaproteobacterium REDSEA-S10_B10N81.58247222262.96c__Alphaproteobacteria (UID3305)78.7078.7Red Sea water column Station 341018.58 N 40.743 EPRJNA289734SAMN04534632LUQE000000002651870131
SAR324 cluster deltaproteobacterium REDSEA-S05_B41.75357206046.78k__Bacteria (UID3187)54.110.9453.17Red Sea water column Station 222517.996 N 39.799 EPRJNA289734SAMN04534633LUQF000000002654587902
SAR324 cluster deltaproteobacterium REDSEA-S06_B41.7337387347.26k__Bacteria (UID3187)54.220.0554.17Red Sea water column Station 225017.996 N 39.799 EPRJNA289734SAMN04534634LUQG000000002651870251
SAR324 cluster deltaproteobacterium REDSEA-S08_B72.12328150443.26k__Bacteria (UID2495)55.094.0951Red Sea water column Station 2220017.996 N 39.799 EPRJNA289734SAMN04534635LUQH000000002651870252
SAR324 cluster deltaproteobacterium REDSEA-S09_B33.3583182342.85k__Bacteria (UID3187)92.1092.1Red Sea water column Station 2250017.996 N 39.799 EPRJNA289734SAMN04534636LUQI000000002651870253
SAR324 cluster deltaproteobacterium REDSEA-S10_B53.49290296247.12k__Bacteria (UID3187)90.84090.84Red Sea water column Station 341018.58 N 40.743 EPRJNA289734SAMN04270322LNZD000000002651870254
SAR324 cluster deltaproteobacterium REDSEA-S11_B72.54416174347.48k__Bacteria (UID3187)73.123.7869.34Red Sea water column Station 342518.58 N 40.743 EPRJNA289734SAMN04534638LUQJ000000002651870255
SAR324 cluster deltaproteobacterium REDSEA-S14_B101.95455342642.39k__Bacteria (UID3187)623.1258.88Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534639LUQK000000002651870256
SAR324 cluster deltaproteobacterium REDSEA-S15_B62.79191147142.4k__Bacteria (UID3187)86.181.7384.45Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534640LUQL000000002651870257
SAR324 cluster deltaproteobacterium REDSEA-S21_B53.05128306142.86k__Bacteria (UID3187)89.510.2289.29Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534641LUQM000000002651870258
SAR324 cluster deltaproteobacterium REDSEA-S26_B72.26415257342.53k__Bacteria (UID3187)69.561.0568.51Red Sea water column Station 10820022.046 N 37.929 EPRJNA289734SAMN04534642LUQN000000002651870259
SAR324 cluster deltaproteobacterium REDSEA-S27_B33.3080313442.85k__Bacteria (UID3187)92.1092.1Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534643LUQO000000002651870260
SAR324 cluster deltaproteobacterium REDSEA-S33_B43.1294306942.82k__Bacteria (UID3187)92.1092.1Red Sea water column Station 14950023.604 N 37.054 EPRJNA289734SAMN04534644LUQP000000002651870261
SAR324 cluster deltaproteobacterium REDSEA-S36_B131.37323164046.89k__Bacteria (UID3187)51.761.6850.08Red Sea water column Station 1695025.772 N 36.116 EPRJNA289734SAMN04534645LUQQ000000002651870262
SAR324 cluster deltaproteobacterium REDSEA-S39_B53.04266305743k__Bacteria (UID3187)88.882.686.28Red Sea water column Station 16950025.772 N 36.116 EPRJNA289734SAMN04534646LUQR000000002651870263
SAR324 cluster deltaproteobacterium REDSEA-S45_B33.1789303842.89k__Bacteria (UID3187)92.1092.1Red Sea water column Station 19250027.897 N 34.507 EPRJNA289734SAMN04534647LUQS000000002651870264
SAR86 cluster gammaproteobacterium REDSEA-S08_B31.56177232936.99c__Gammaproteobacteria (UID4443)61.924.1657.76Red Sea water column Station 2220017.996 N 39.799 EPRJNA289734SAMN04534648LUQT000000002651870265
SAR86 cluster gammaproteobacterium REDSEA-S09_B41.67107153437.05c__Gammaproteobacteria (UID4443)68.132.4965.64Red Sea water column Station 2250017.996 N 39.799 EPRJNA289734SAMN04534649LUQU000000002651870266
SAR86 cluster gammaproteobacterium REDSEA-S20_B12N41.70364204338.32c__Gammaproteobacteria (UID4201)63.978.9655.01Red Sea water column Station 9120020.525 N 38.781 EPRJNA289734SAMN04534650LUQV000000002651870267
SAR86 cluster gammaproteobacterium REDSEA-S21_B71.52187178137.04c__Gammaproteobacteria (UID4201)75.212.8772.34Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534651LUQW000000002651870268
SAR86 cluster gammaproteobacterium REDSEA-S45_B61.57123178137k__Bacteria (UID203)83.076.5876.49Red Sea water column Station 19250027.897 N 34.507 EPRJNA289734SAMN04534654LUQZ000000002651870271
Sphingopyxis sp. REDSEA-S22_B53.06103360665.03o__Sphingomonadales (UID3310)98.030.7397.3Red Sea water column Station 1081022.046 N 37.929 EPRJNA289734SAMN04534655LURA000000002651870196
Sphingopyxis sp. REDSEA-S23_B63.24118356065.08o__Sphingomonadales (UID3310)96.020.3495.68Red Sea water column Station 1082522.046 N 37.929 EPRJNA289734SAMN04534656LURB000000002651870197
Sphingopyxis sp. REDSEA-S24_B71.75439215665.3o__Sphingomonadales (UID3310)52.051.4750.58Red Sea water column Station 1085022.046 N 37.929 EPRJNA289734SAMN04534657LURC000000002651870198
Sphingopyxis sp. REDSEA-S29_B33.4643354065.19o__Sphingomonadales (UID3310)98.640.6897.96Red Sea water column Station 1492523.604 N 37.054 EPRJNA289734SAMN04534659LURE000000002651870200
Sphingopyxis sp. REDSEA-S34_B101.75399217865.03o__Sphingomonadales (UID3310)50.930.7650.17Red Sea water column Station 1691025.772 N 36.116 EPRJNA289734SAMN04534660LURF000000002651870201
Sphingopyxis sp. REDSEA-S38_B162.14337248564.84o__Sphingomonadales (UID3310)68.273.7264.55Red Sea water column Station 16920025.772 N 36.116 EPRJNA289734SAMN04534661LURG000000002651870202
Sphingopyxis sp. REDSEA-S40_B63.4573355865.18o__Sphingomonadales (UID3310)97.420.5196.91Red Sea water column Station 1921027.897 N 34.507 EPRJNA289734SAMN04534662LURH000000002651870203
Sphingopyxis sp. REDSEA-S42_B32.9111345565.26o__Sphingomonadales (UID3310)99.640.3499.3Red Sea water column Station 1925027.897 N 34.507 EPRJNA289734SAMN04534663LURI000000002651870204
Synechococcus sp. REDSEA-S01_B11.8078221662.76p__Cyanobacteria (UID2143)95.920.8295.1Red Sea water column Station 121017.662 N 40.905 EPRJNA289734SAMN04534664LURJ000000002651870193
Synechococcus sp. REDSEA-S02_B41.7881172462.76p__Cyanobacteria (UID2143)95.020.2794.75Red Sea water column Station 122517.662 N 40.905 EPRJNA289734SAMN04534665LURK000000002651870194
Unclassified gammaproteobacterium REDSEA-S03_B51.17129141138.75p__Proteobacteria (UID3880)67.341.2266.12Red Sea water column Station 124717.662 N 40.905 EPRJNA289734SAMN04534682LUSB000000002651870282
Unclassified gammaproteobacterium REDSEA-S08_B81.22235279851.01p__Proteobacteria (UID3882)55.590.6154.98Red Sea water column Station 2220017.996 N 39.799 EPRJNA289734SAMN04534683LUSC000000002651870283
Unclassified gammaproteobacterium REDSEA-S09_B132.28371401951.81p__Proteobacteria (UID3880)73.031.2271.81Red Sea water column Station 2250017.996 N 39.799 EPRJNA289734SAMN04534684LUSD000000002651870284
Unclassified gammaproteobacterium REDSEA-S12_B41.30146307338.79p__Proteobacteria (UID3880)71.754.2767.48Red Sea water column Station 345018.58 N 40.743 EPRJNA289734SAMN04534685LUSE000000002651870285
Unclassified gammaproteobacterium REDSEA-S14_B72.48348186052.09p__Proteobacteria (UID3880)83.91.3782.53Red Sea water column Station 3420018.58 N 40.743 EPRJNA289734SAMN04534686LUSF000000002651870286
Unclassified gammaproteobacterium REDSEA-S15_B122.90309167752.01p__Proteobacteria (UID3880)89.741.9387.81Red Sea water column Station 3425818.58 N 40.743 EPRJNA289734SAMN04534687LUSG000000002651870287
Unclassified gammaproteobacterium REDSEA-S21_B82.72374304852.02p__Proteobacteria (UID3880)89.652.4487.21Red Sea water column Station 9150020.525 N 38.781 EPRJNA289734SAMN04534688LUSH000000002651870288
Unclassified gammaproteobacterium REDSEA-S26_B101.41345179952.17p__Proteobacteria (UID3882)54.21.8852.32Red Sea water column Station 10820022.046 N 37.929 EPRJNA289734SAMN04534689LUSI000000002651870289
Unclassified gammaproteobacterium REDSEA-S27_B141.43327174252.05p__Proteobacteria (UID3880)57.132.4454.69Red Sea water column Station 10850022.046 N 37.929 EPRJNA289734SAMN04534690LUSJ000000002654587903
Unclassified gammaproteobacterium REDSEA-S33_B151.53372185352.31p__Proteobacteria (UID3882)561.0254.98Red Sea water column Station 14950023.604 N 37.054 EPRJNA289734SAMN04534691LUSK000000002651870290
Unclassified gammaproteobacterium REDSEA-S45_B92.31340262651.86p__Proteobacteria (UID3880)70.432.0368.4Red Sea water column Station 19250027.897 N 34.507 EPRJNA289734SAMN04534692LUSL000000002651870291

Phylogenomic analysis based on sets of single-copy marker genes universal to either the bacterial or archaeal domain showed that the 136 genomes encompassed seven phyla across these domains: Thaumarchaeota, Euryarchaeota, Actinobacteria, Cyanobacteria, Bdellovibrionaeota, Proteobacteria, and Marinimicrobia (Fig. 2 and Table 2 (available online only)). As expected, most of the recovered genomes were affiliated with known marine microorganisms such as phototrophic Prochlorococcus (refs 20, 21) and Synechococcus (refs 22, 23); representatives of clades first discovered in the Sargasso Sea (SAR86, SAR116, SAR324 and SAR406) (refs 24–26); common marine bacteria in tropical biomes such as Alteromonas macleodii (ref. 27); an ammonia-oxidizing thaumarchaeon from the genus Nitrosopelagicus (ref. 28); euryarchaeotal Marine Group II organisms reported to be abundant in surface waters (ref. 29); and members of the Alpha- and Gammaproteobacteria such as Aeromicrobium, Erythrobacter, Maritimibacter, Idiomarina, Marinobacter, Candidatus Thioglobus (SUP05 cluster) and several unclassified Gammaproteobacteria, consistent with the high relative abundance of these two groups in the recent Tara Oceans survey (ref. 30). Additionally, actinobacterial Acidiimicrobia and Nocardioides genomes thought to be responsible for secondary metabolite production in marine ecosystems (ref. 31) were recovered from the metagenomes. An important strength of this dataset is the recovery of multiple, closely related genomes from different stations or depths in the Red Sea (Data Citation 2). When complemented with physicochemical data (ref. 1), these genomes allow future studies to investigate how genome plasticity among closely related organisms may confer fitness under varying conditions.

To allow easy access to the genomes, all 136 genomes were functionally annotated and deposited into the National Center for Biotechnology Information (NCBI) and Integrated Microbial Genomes (IMG) databases (ref. 32). The wealth of metagenomic and genomic data described here greatly expands the repertoire of microbial genomic information from the Red Sea, which might help to better understand the effects of global warming on ocean microbiomes. These datasets will also strengthen studies of the drivers of marine nutrient cycling, support bioprospecting for novel thermophilic and halophilic enzymes, and allow for a better understanding of microbial adaptation strategies against high temperature, salinity and solar irradiance.

Methods

Metagenomic sequencing and assembly

Seawater samples were collected from eight stations and from different depths (10, 25, 50, 100, 200, and 500 m; locations are shown in Fig. 1) during summer as part of KRSE2011 (ref. 1). Genomic DNA was extracted from the 0.1–1.2 μm size fraction using an established phenol-chloroform extraction protocol (refs 1, 33). Paired-end libraries (2×100 bp) were prepared using the Nextera DNA Library Prep Kit (Illumina) and sequenced on a HiSeq 2000 (Illumina). Reads were quality checked and trimmed using PRINSEQ v0.20.4 (ref. 34), generating read lengths of ~93 bp and a total of ~10 million reads per sample with median insert sizes ranging from 183–366 bp (ref. 1) (Data Citation 1). Trimmed metagenome reads were individually assembled (Table 1 (available online only)) using IDBA-UD v1.1.1 (ref. 35) with the ‘--pre-correction’ option. To obtain the coverage profile of contigs from each metagenomic assembly, the trimmed reads were mapped back to the contigs using BWA v0.7.12 (ref. 36) with the bwa-mem algorithm.
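As a rough illustration of what a per-contig coverage profile represents, the sketch below approximates mean depth from the number of reads mapped to each contig, assuming the ~93 bp trimmed read length reported above. The function name, the input dictionaries and the example contigs are hypothetical; the study derived coverage from BWA alignments, and binning tools typically compute depth per base from the alignment files rather than with this shortcut.

```python
def mean_coverage(mapped_reads, contig_lengths, read_length=93):
    """Approximate mean depth per contig as reads * read_length / contig length.

    mapped_reads: {contig_id: number of reads mapped to that contig}
    contig_lengths: {contig_id: contig length in bp}
    read_length: assumed average trimmed read length in bp
    """
    coverage = {}
    for contig, length in contig_lengths.items():
        if length <= 0:
            raise ValueError(f"contig {contig!r} has non-positive length")
        # Contigs with no mapped reads get a depth of zero.
        coverage[contig] = mapped_reads.get(contig, 0) * read_length / length
    return coverage


# Hypothetical example values, not taken from the study.
example = mean_coverage(
    {"contig_1": 1000},
    {"contig_1": 9300, "contig_2": 5000},
)
print(example)
```

Contigs that originate from the same genome are expected to show similar depth within a sample, which is why coverage is a useful signal for binning alongside sequence composition.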

Genome binning, refinement, and annotation

For each metagenome, genome bins were recovered based on tetranucleotide frequencies and read coverage using MetaBAT v0.26.1 (ref. 37) with default parameters. The completeness and contamination of the bins were assessed using CheckM v1.0.3 (ref. 38) with the lineage-specific workflow (Table 2 (available online only)). Bins were further refined using the CheckM ‘merge’ and ‘outliers’ commands, which respectively merge bins with complementary sets of marker genes to improve completeness and remove contigs that appear to be outliers relative to reference GC and tetranucleotide distributions to reduce contamination (ref. 38). The FinishM v0.0.7 (https://github.com/wwood/finishm) ‘roundup’ workflow, which comprises the ‘wander’ and ‘gapfill’ modes, was used to scaffold contigs together and fill gaps within individual bins. The ‘wander’ mode uses a de Bruijn graph (kmer length of 51 bp and coverage cutoff of 5) to determine which contig ends are connected, while the ‘gapfill’ mode aligns reads to regions of ambiguous nucleotides and replaces them with the appropriate nucleotides. Genome bins that passed the quality filter of completion minus contamination of ≥50% were submitted to IMG/ER (ref. 32) for gene calling and functional annotation.
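The quality filter described above (completion minus contamination of ≥50%) can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, the first two entries use completeness and contamination percentages as they appear in Table 2, and the third entry is an invented bin included only to show a rejection.

```python
def quality_score(completeness, contamination):
    """Return completeness minus contamination, both given in percent."""
    return completeness - contamination


def passes_filter(completeness, contamination, threshold=50.0):
    """True if the bin meets the completion-minus-contamination cutoff."""
    return quality_score(completeness, contamination) >= threshold


bins = [
    # (name, completeness %, contamination %)
    ("Prochlorococcus sp. REDSEA-S23_B1", 61.61, 7.2),   # Table 2 values
    ("Sphingopyxis sp. REDSEA-S34_B10", 50.93, 0.76),    # Table 2 values
    ("hypothetical_bin", 60.0, 15.0),                    # invented example
]

kept = [name for name, comp, cont in bins if passes_filter(comp, cont)]
print(kept)
```

Subtracting contamination penalises bins that reach high apparent completeness only by including contigs from other organisms, so a single threshold covers both criteria.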

Genome tree construction

The archaeal and bacterial genome trees (Fig. 2) were inferred from the concatenation of 122 and 120 proteins, respectively, identified as being present in ≥90% of the genomes in their respective domains and, when present, single-copy in ≥95% of genomes (Supplementary Tables 1 and 2). These marker genes were aligned using HMMER v3.1b1 (ref. 39), and the trees were inferred from the concatenated alignments with FastTree v2.1.7 (ref. 40) under the WAG+GAMMA model (Data Citation 2). Support values were determined using 100 non-parametric bootstrap replicates (ref. 41). The archaeal tree was rooted with the DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanohaloarchaeota, and Nanoarchaeota) superphylum in concordance with a recent large-scale phylogenomic study (ref. 9), while the bacterial tree was ‘arbitrarily’ rooted with the phylum Chloroflexi (ref. 42) but should be treated as unrooted. The trees were visualized in ARB (ref. 43), annotated with iTOL (ref. 44) and edited in Illustrator CC 2014 (Adobe).
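The marker selection criteria above can be illustrated with a short sketch. It assumes one reading of the text, namely that the ≥95% single-copy requirement is evaluated only among the genomes in which a marker is present. The function name, the input layout and the toy data are hypothetical and are not taken from the study.

```python
def select_markers(copy_counts, genomes, min_presence=0.90, min_single_copy=0.95):
    """Select markers that are widespread and, when present, mostly single-copy.

    copy_counts: {marker: {genome: number of copies}}; missing genomes count as 0
    genomes: list of all genome identifiers in the domain
    """
    selected = []
    for marker, counts in copy_counts.items():
        present = [g for g in genomes if counts.get(g, 0) > 0]
        if not present:
            continue
        presence_fraction = len(present) / len(genomes)
        # Single-copy fraction is computed only over genomes that carry the marker.
        single_copy_fraction = sum(1 for g in present if counts[g] == 1) / len(present)
        if presence_fraction >= min_presence and single_copy_fraction >= min_single_copy:
            selected.append(marker)
    return selected


# Toy data: 20 hypothetical genomes and four hypothetical markers.
genomes = [f"g{i}" for i in range(20)]
copy_counts = {
    "marker_a": {g: 1 for g in genomes},                                  # kept
    "marker_b": {g: 1 for g in genomes[:17]},                             # present in 85%
    "marker_c": {g: (2 if i < 2 else 1) for i, g in enumerate(genomes)},  # 90% single-copy
    "marker_d": {g: 1 for g in genomes[:19]},                             # present in 95%
}
selected = select_markers(copy_counts, genomes)
print(selected)
```

Restricting the concatenated alignment to widespread, predominantly single-copy proteins limits missing data and reduces the risk of comparing paralogous copies across genomes.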

Code availability

All versions of third-party software and scripts used in this study are described and referenced accordingly in the Methods sub-sections for ease of access and reproducibility.

Data Records

The raw Illumina sequencing paired-end reads (Table 1 (available online only)), 45 assembled metagenome sequences (Table 1 (available online only)) and 136 assembled genome sequences (Table 2 (available online only)), generated from the KAUST Red Sea Expedition 2011, are available from NCBI databases (Data Citation 1). The genome trees and associated fasta amino acid alignment files are available from Figshare (Data Citation 2).

Technical Validation

To validate the completeness and contamination of the genomes, we assessed the number of marker genes present in all bacterial and archaeal genomes using CheckM (ref. 38). The genomes were also manually cleaned of vector contamination by comparing against the UniVec core database (ftp://ftp.ncbi.nlm.nih.gov/pub/UniVec/).

Usage Notes

The annotated genome assemblies can be downloaded and accessed via the Integrated Microbial Genomes (IMG) system (https://img.jgi.doe.gov/cgi-bin/m/main.cgi). The IMG genome IDs are provided in Table 2 (available online only).

Additional Information

How to cite this article: Haroon, M. F. et al. A catalogue of 136 microbial draft genomes from Red Sea metagenomes. Sci. Data 3:160050 10.1038/sdata.2016.50 (2016).

Supplementary Material

Supplementary Tables:

Acknowledgments

We acknowledge the people who were involved in the KAUST Red Sea Expedition 2011 and those who helped to generate the data, including, but not limited to, Matt Cahill, Mamoon Rashid, Vinu Manikandan, David Ngugi and Ahmed Shibl. This work was supported by King Abdullah University of Science and Technology (KAUST), a Saudi Basic Industries Corporation (SABIC) fellowship to L.R.T., and a SABIC presidential chair to U.S.

Footnotes

The authors declare no competing financial interests.

Data Citations

References

  1. Thompson L. R. et al. Metagenomic covariation along densely sampled environmental gradients in the Red Sea. bioRxiv 10.1101/055012 (2016).
  2. Churchill J. H., Bower A. S., McCorkle D. C. & Abualnaja Y. The transport of nutrient-rich Indian Ocean water through the Red Sea and into coastal reef systems. Journal of Marine Research 72, 165–181 (2014).
  3. Sagar S. et al. Cytotoxic and apoptotic evaluations of marine bacteria isolated from brine-seawater interface of the Red Sea. BMC Complementary and Alternative Medicine 13, 1–8 (2013).
  4. Jimenez-Infante F. et al. Genomic differentiation among two strains of the PS1 clade isolated from geographically separated marine habitats. FEMS microbiology ecology 89, 181–197 (2014).
  5. Zhang G., Haroon M. F., Zhang R., Hikmawan T. & Stingl U. Draft Genome Sequence of Pseudoalteromonas sp. Strain XI10 Isolated from the Brine-Seawater Interface of Erba Deep in the Red Sea. Genome Announcements 4, e00109–16 (2016).
  6. Fuller N. J. et al. Clade-specific 16S ribosomal DNA oligonucleotides reveal the predominance of a single marine Synechococcus clade throughout a stratified water column in the Red Sea. Applied and Environmental Microbiology 69, 2430–2443 (2003).
  7. Qian P.-Y. et al. Vertical stratification of microbial communities in the Red Sea revealed by 16S rDNA pyrosequencing. The ISME journal 5, 507–518 (2011).
  8. Ngugi D. K. & Stingl U. Combined analyses of the ITS loci and the corresponding 16S rRNA genes reveal high micro-and macrodiversity of SAR11 populations in the Red Sea. PLoS ONE 7, e50274 (2012).
  9. Rinke C. et al. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499, 431–437 (2013).
  10. Brown C. T. et al. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature 523, 208–211 (2015).
  11. Grötzinger S. W. et al. Mining a database of single amplified genomes from Red Sea brine pool extremophiles—improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA). Frontiers in Microbiology 5, 134 (2014).
  12. Clingenpeel S., Clum A., Schwientek P., Rinke C. & Woyke T. Reconstructing each cell’s genome within complex microbial communities - dream or reality? Frontiers in Microbiology 5 (2015).
  13. Gawad C., Koh W. & Quake S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
  14. Albertsen M. et al. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nature Biotechnology 31, 533–538 (2013).
  15. Nielsen H. B. et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat. Biotech. 32, 822–828 (2014).
  16. Sangwan N., Xia F. & Gilbert J. A. Recovering complete and draft population genomes from metagenome datasets. Microbiome 4, 1–11 (2016).
  17. Haroon M. F. et al. Anaerobic oxidation of methane coupled to nitrate reduction in a novel archaeal lineage. Nature 500, 567–570 (2013).
  18. Soo R. M. et al. An expanded genomic representation of the phylum Cyanobacteria. Genome biology and evolution 6, 1031–1045 (2014).
  19. Evans P. N. et al. Methane metabolism in the archaeal phylum Bathyarchaeota revealed by genome-centric metagenomics. Science 350, 434–438 (2015).
  20. Moore L. R., Rocap G. & Chisholm S. W. Physiology and molecular phylogeny of coexisting Prochlorococcus ecotypes. Nature 393, 464–467 (1998).
  21. Partensky F., Hess W. & Vaulot D. Prochlorococcus, a marine photosynthetic prokaryote of global significance. Microbiology and molecular biology reviews 63, 106–127 (1999).
  22. Moore L. R., Goericke R. & Chisholm S. W. Comparative physiology of Synechococcus and Prochlorococcus: influence of light and temperature on growth, pigments, fluorescence and absorptive properties. Marine ecology progress series. Oldendorf 116, 259–275 (1995).
  23. Palenik B. et al. The genome of a motile marine Synechococcus. Nature 424, 1037–1042 (2003).
  24. Giovannoni S. J., Britschgi T. B., Moyer C. L. & Field K. G. Genetic diversity in Sargasso Sea bacterioplankton. Nature 345, 60–63 (1990).
  25. Britschgi T. B. & Giovannoni S. J. Phylogenetic analysis of a natural marine bacterioplankton population by rRNA gene cloning and sequencing. Applied and Environmental Microbiology 57, 1707–1713 (1991).
  26. Haroon M. F., Thompson L. R. & Stingl U. Draft genome sequence of uncultured SAR324 bacterium lautmerah10, binned from a Red Sea metagenome. Genome Announcements 4, e01711–e01715 (2016).
  27. Ivars-Martinez E. et al. Comparative genomics of two ecotypes of the marine planktonic copiotroph Alteromonas macleodii suggests alternative lifestyles associated with different kinds of particulate organic matter. The ISME journal 2, 1194–1212 (2008).
  28. Santoro A. E. et al. Genomic and proteomic characterization of ‘Candidatus Nitrosopelagicus brevis’: An ammonia-oxidizing archaeon from the open ocean. Proceedings of the National Academy of Sciences 112, 1173–1178 (2015).
  29. DeLong E. F. Archaea in coastal marine environments. Proceedings of the National Academy of Sciences 89, 5685–5689 (1992).
  30. Sunagawa S. et al. Structure and function of the global ocean microbiome. Science 348, 1261359 (2015).
  31. Bull A. T. & Stach J. E. Marine actinobacteria: new opportunities for natural product search and discovery. Trends in microbiology 15, 491–499 (2007).
  32. Markowitz V. M. et al. IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Research 40, D115–D122 (2012).
  33. Rusch D. et al. The Sorcerer II Global Ocean Sampling expedition: Northwest Atlantic through eastern tropical Pacific. PLoS Biology 5, e77 (2007).
  34. Schmieder R. & Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011).
  35. Peng Y., Leung H. C., Yiu S.-M. & Chin F. Y. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428 (2012).
  36. Li H. & Durbin R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
  37. Kang D. D., Froula J., Egan R. & Wang Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3, e1165 (2015).
  38. Parks D. H., Imelfort M., Skennerton C. T., Hugenholtz P. & Tyson G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Research 25, 1043–1055 (2015).
  39. Eddy S. R. Accelerated Profile HMM Searches. PLoS Computational Biology 7, e1002195 (2011).
  40. Price M. N., Dehal P. S. & Arkin A. P. FastTree 2—Approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
  41. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791 (1985).
  42. Dagan T., Roettger M., Bryant D. & Martin W. Genome Networks Root the Tree of Life between Prokaryotic Domains. Genome Biology and Evolution 2, 379–392 (2010).
  43. Ludwig W. et al. ARB: a software environment for sequence data. Nucleic Acids Research 32, 1363–1371 (2004).
  44. Letunic I. & Bork P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Research 39, W475–W478 (2011).

Articles from Scientific Data are provided here courtesy of Nature Publishing Group
