Biology question

Wise Guy
Soldato
Joined
23 May 2009
Posts
5,748
How many MB does a sperm hold?

The human genome is supposed to be about 350 MB of data, so wouldn't a sperm contain all that data in DNA? According to this it only contains 1 bit (X or Y), so an entire ejaculation would only be 21MB. That doesn't sound right to me.

Now if you really can fit 350MB on a single sperm couldn't this somehow be harnessed for data storage? That's an incredible amount of data for such a small space.

http://www.utheguru.com/fun-science-how-many-megabytes-in-the-human-body

The number of Megabytes ‘exchanged’ during human reproduction

The human sperm can be one of three states - x, y or wasted. Each sperm cell in a human male is heterogametic, meaning it contains only one of two sex chromosomes (x or y) - incidentally, the female egg is homogametic - meaning that it only has an x chromosome.

This means the male ‘determines’ the sex of the child, which makes a mockery of Henry the 8th’s annulment of his marriage with Catherine of Aragon on account of the fact that she was ‘incapable of providing a male heir’.

Basically, sperm cells are like bits - they can (in most cases) be only one of two states - x or y, or in digital form, 0 or 1. So, it’s possible to express an ejaculate in megabytes (!?!) - Let’s try.

The average human ejaculate contains around 180 million sperm. So, that’s 180,000,000 bits. If we use Google to convert 180 million bits to megabytes, we find that approximately 21.45 megabytes of ‘data’ is transferred during each act of human sexual reproduction in the form of gametes.
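The conversion in the quoted article can be checked with a few lines of Python. This is only a sketch of the arithmetic: the one-bit-per-sperm premise is the article's own assumption, and the function name `bits_to_megabytes` is made up for illustration. The quoted figure appears to use binary megabytes (2**20 bytes); with decimal megabytes the result is a little higher.

```python
# Convert a count of bits into megabytes, using both common definitions.
SPERM_PER_EJACULATE = 180_000_000  # figure quoted in the article
BITS_PER_SPERM = 1                 # the article's "x or y" premise

def bits_to_megabytes(bits: int, binary: bool = True) -> float:
    """Return megabytes; binary=True uses 1 MB = 2**20 bytes, else 10**6."""
    n_bytes = bits / 8
    return n_bytes / (2**20 if binary else 10**6)

total_bits = SPERM_PER_EJACULATE * BITS_PER_SPERM
print(round(bits_to_megabytes(total_bits), 2))         # binary megabytes
print(round(bits_to_megabytes(total_bits, False), 2))  # decimal megabytes
```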
 
You can't compare them; they are two different things.

A sperm carries X or Y.

If anything, a sperm would hold half the data and get the other half from the egg...
 
I agree, you simply can't compare them. In its most basic form the information stored in the human genome is not simply a matter of 'on' or 'off', 'one' or 'zero', and so couldn't be read by any computer we have.

Besides, 'X' or 'Y' isn't just 'one bit': the X chromosome contains in itself 1208 genes and the Y 104. This is where getting free fridge magnets from the Open University really comes into its own ;)
 
So, what you're saying is that women are more complex? I get it now... :p
 
I find expressing the human genome as an amount of computer data illogical. Sure, storing that data on a PC might take that much memory, but that doesn't mean the genome can be used as memory - very different situations.
 
Now if you really can fit 350MB on a single sperm couldn't this somehow be harnessed for data storage? That's an incredible amount of data for such a small space.

So what you’re saying is that when asked “what the hell are you doing?” the next time I’m caught beating one off in the library, I can simply reply “saving my work”?
 
Each DNA base has a possible total of four values (adenine, guanine, cytosine or thymine); that's the direct equivalent of two bits of data in a computer.
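The two-bits-per-base idea can be sketched in Python as below. The bit assignment for each base is arbitrary, the helper names (`pack`, `unpack`, `megabytes_for`) are made up for illustration, and the haploid genome length of roughly 3.1 billion bases is an assumption that does not come from this thread.

```python
# Map each base to a two-bit value (this particular assignment is arbitrary).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}

def pack(sequence: str) -> int:
    """Pack a DNA string into one integer, two bits per base."""
    value = 0
    for base in sequence:
        value = (value << 2) | BASE_TO_BITS[base]
    return value

def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from a packed integer."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))

def megabytes_for(n_bases: int) -> float:
    """Size in decimal megabytes at two bits per base."""
    return n_bases * 2 / 8 / 10**6

packed = pack("GATTACA")
print(unpack(packed, 7))             # round-trips the original sequence
print(megabytes_for(3_100_000_000))  # assumed haploid genome length
```

Note that this counts only the raw base sequence, with no compression and nothing beyond the four-letter alphabet.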

Though I agree it gets a little more complicated when you consider that DNA is a triplet code, with each three-base sequence encoding for an amino acid. There are a total of 64 three-base combinations the DNA could potentially use, but in reality it encodes for only 20 amino acids, and the other combinations are either junk DNA or act as control mechanisms.
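The figure of 64 is the number of ways to arrange four bases in a sequence of three (4 × 4 × 4). A short Python sketch that enumerates them, using only the standard library:

```python
from itertools import product

BASES = "ACGT"

# Every ordered three-base combination (triplet) of the four bases.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # number of distinct triplets
print(codons[:4])   # the first few, in alphabetical order
```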

But as for this question, I imagine that for each sperm the number would be slightly above half of the 350MB number (presuming that is correct). The sperm carries a haploid set of chromosomes, that is to say, a random assortment of half of the father's chromosomes (23 chromosomes), whereas most cells in the human body are diploid, which means they have two sets of chromosomes (46 chromosomes). However, some extra DNA is carried in the mitochondria of each sperm cell, although it is not unique to each one. Arguably that shouldn't count, because only the DNA from the mother's mitochondria is ever passed to the child.

I'd argue the way they've answered the question is entirely incorrect, as the sequence of sperm carrying X or Y chromosomes doesn't actually encode for anything, but the unique combination of the father's DNA that they each carry does, so that's what should be considered.

Yay this counts as human biology revision. :D
 
I've heard this before, and I agree that you cannot compare them. A sperm contains 23 chromosomes, which contain a whole load of base sequences of DNA, which are made up of nucleotides, which are made up of carbon, nitrogen, hydrogen, oxygen and phosphorus, which are made up of protons, neutrons and electrons. Now which part of those is the equivalent of one 'bit'?
 
Haha, I was going to say you sounded like you're doing A-level biology.
 
That's significantly underestimating the amount of 'data' (to continue the abstraction, if we must) that DNA stores. On top of the base sequence, there's the whole field of epigenetics to be considered, as well as post-transcription modification of RNA etc etc.

As has been said above, to consider the human genome in terms of 'bits' (1 = AT, 0 = CG) is a nonsense anyway. If you wanted to download the human genome for looking at in a more useful way - on a computer screen - then it's about 765MB when gzipped (excluding non-chromosomal DNA). A mouse in comparison is 700MB. Just to 1) put people in their place and 2) reinforce the point that to talk of DNA in bits is completely missing the point, a quick calculation would put the largest known genome (a plant - Paris japonica) at 37GB (that's still gzipped!).

But data storage in the way you're talking about (rapid read, rapid access, high fidelity etc) isn't really what DNA's all about.

// EDIT // And for those referring to 'junk DNA' ... please don't, you make me sad :(
 
Well I suppose the question is sort of answered by ANother.

The idea isn't as far-fetched as it sounds though. Trust the Japanese to come up with it. They use bacteria instead of jizz though.

http://www.dailygalaxy.com/my_weblog/2009/11/artificial-dna-an-immortal-library-of-human-knowledge-.html

Professor Masaru Tomita and his team of researchers at Keio University, Japan, have developed artificial DNA with encoded information that can be added to the genome of common bacteria. The four characters used in genetic coding (A's, T's, G's and C's) work much like digital data. If coded in a particular way, different character combinations can represent specific letters and symbols which can then be translated to produce music, text, video and other content.
 
Well this is true in the sense that you can apply any kind of decoding abstraction over the top of a preset code. This is all that happens when the bit-byte data making up this webpage is converted into a character encoding for us to read here. Apply BIG5 instead of UTF-8 (or whatever) and you get different text.
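The point about character encodings is easy to demonstrate. In this sketch the sample word is arbitrary, and `errors="replace"` is used only so the example cannot fail on a byte pair the second codec does not recognise:

```python
# The same bytes, read through two different character encodings.
raw = "café".encode("utf-8")  # the underlying byte sequence

as_utf8 = raw.decode("utf-8")
as_big5 = raw.decode("big5", errors="replace")

print(raw)      # the bytes themselves never change
print(as_utf8)  # decoded with the encoding they were written in
print(as_big5)  # decoded as Big5: the ASCII part survives, the rest does not
```

Nothing about the stored bytes differs between the two results; only the decoding table applied on top of them does.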

Given a chosen encoding system, then yes, you can reverse engineer the primary code to produce a poem or a tune. That's not really anything fundamentally novel, though. In fact Craig Venter's team did something very similar by signing the DNA they used to create their 'artificial life' (don't get me started on that bull!).

With a reverse engineered code, yes, you can put it in bacteria as a plasmid, but that doesn't make it useful in a storage environment. Firstly, bacteria don't like having extraneous DNA floating about and can gain a fitness benefit by losing the plasmid. Secondly, background mutation in DNA (UV, cell replication etc etc) will cause alterations in the code. That might not be such an issue if an exchange mutation occurs (say C to T), but if a base is deleted or added, this will render everything past that point unreadable by your chosen decoding method (a frame shift mutation). Other points to take into account are the inherent errors in the polymerase chain reaction that would be used to create these plasmids in the first place.
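The difference between a substitution and a frame shift can be sketched with a toy encoding. Everything here is an assumption made for illustration: four bases per ASCII character, an arbitrary base-to-bits mapping, and hypothetical `encode` and `decode` helpers. It is not the scheme used by the Keio team or by anyone else.

```python
BASES = "ACGT"  # the index of each base is its two-bit value (arbitrary choice)

def encode(text: str) -> str:
    """Encode ASCII text as DNA, four bases per character."""
    out = []
    for byte in text.encode("ascii"):
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(dna: str) -> str:
    """Decode DNA back to text, reading four bases at a time."""
    chars = []
    # Ignore any trailing bases that do not fill a complete group of four.
    for i in range(0, len(dna) - len(dna) % 4, 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        chars.append(chr(byte))
    return "".join(chars)

dna = encode("HELLO WORLD")

# Substitution: swap the base at position 9 (inside the third character).
substituted = dna[:9] + ("A" if dna[9] != "A" else "C") + dna[10:]
# Deletion: drop the same base, so every later group of four is misaligned.
deleted = dna[:9] + dna[10:]

print(decode(dna))
print(repr(decode(substituted)))  # exactly one character differs
print(repr(decode(deleted)))      # reading frame shifted from the third character on
```

Real storage schemes would add error correction on top, which is part of why the raw capacity figure overstates what is practically usable.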

At the end of the day, yes, DNA can be used to code 'data' that can be interpreted via an encoding, but it's not anything more than a proof-of-principle.
 
Lose the megabyte and bit analogy and use the proper terms for a start. It's megabases, and nucleic acids instead of bits.

DNA has nothing to do with computers so please don't confuse the two.
 
If each atom in your head could be used as a bit, your head could store 5,184,119,800,105 terabytes of information. The total data storage capacity globally has been estimated as around 1,000,000 terabytes, so you can rest happy in knowing that your head can contain over 5 million worlds' worth of information inside of it.




We may need to cut off your head for this to work...
 