The Hub

The Community of Minds

Big genomics data raising alarms

  • Fun with science (The COM)
  • RCR (The COM)
  • Science Says (The COM)

The alarming explosion of genome sequencing data was recently addressed in PLoS Biology (Big Data: Astronomical or Genomical?) and touched on in Nature News (Genome researchers raise alarm over big data). The authors compared sequencing with three other big data generators: astronomy, YouTube, and Twitter. Each of these demands massive computing resources for one or two stages of the data life cycle: acquisition (astronomy), storage (astronomy, YouTube), analysis (Twitter), or distribution (YouTube). Sequencing data, however, places heavy demands on all four.

But I think it is most interesting to look at some of the current statistics:

– Illumina has released their new HiSeq X series (sold in sets of five or ten sequencers): each sequencer can generate 6 billion paired-end reads at 2 × 150 bp (i.e. 1.8 Tb, or ~15 human genomes at 30x) every 3 days. 3 days!!

– there are currently more than 3.6 petabases of raw data in NCBI’s SRA, covering ~32 000 microbial genomes, ~5 000 animal and plant genomes, and ~250 000 human genomes; current worldwide sequencing capacity, however, is estimated at 35 petabases/year, roughly ten times that archive every year

– the authors of the PLoS Biol paper show that since 2009, our sequencing capacity has doubled every 7 months; compare this with Illumina’s estimate of doubling every 12 months, or with Moore’s law, which predicts doubling every 18 months

– by 2025, the authors estimate we will need between 2 and 40 exabytes of data storage per year (100 M to 2 B human genomes, or 2 to 40 B Arabidopsis genomes)
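
To get a feel for these numbers, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not anything taken from the paper: the function names are made up for this post, the 3.2 Gb human genome size is my own assumption, and the 10-year horizon is an arbitrary choice for comparing the three doubling times quoted above.

```python
def bases_per_run(read_pairs: int, read_length: int) -> int:
    """Total bases from a paired-end run: two reads per pair."""
    return read_pairs * 2 * read_length


def genomes_per_run(total_bases: float,
                    genome_size: float = 3.2e9,  # assumed human genome size in bases
                    coverage: float = 30) -> float:
    """Nominal number of genomes a run covers at the given depth."""
    return total_bases / (genome_size * coverage)


def growth_factor(years: float, doubling_months: float) -> float:
    """How many times capacity multiplies if it doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


if __name__ == "__main__":
    total = bases_per_run(6_000_000_000, 150)
    print(f"Bases per run: {total / 1e12:.1f} Tb")
    print(f"Nominal 30x human genomes per run: {genomes_per_run(total):.1f}")

    # Compare the three doubling times over an (arbitrary) 10-year span.
    for label, months in [("observed since 2009", 7),
                          ("Illumina estimate", 12),
                          ("Moore's law", 18)]:
        print(f"{label:>20}: x{growth_factor(10, months):,.0f} in 10 years")
```

The nominal genome count from this arithmetic comes out a little above the ~15 quoted above; presumably the quoted figure allows for reads lost to filtering and other overhead, but that is my guess. The more striking part is the last loop: over ten years, a 7-month doubling time gives growth more than a hundred times larger than a 12-month one, which is why the storage estimates balloon so quickly.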

big data data analysis genomics RCR statistics
July 21, 2015 Ian Major

copyright 2021 Bethany Huot/TheCOM,LLC