Human Genome Project

Human Genome Project:

As recombinant DNA and DNA sequencing technologies improved in the 1970s and 1980s, scientists began discussing the possibility of sequencing all 3.2 billion nucleotide pairs in the human genome.

No two individuals are genetically identical (except monozygotic twins) because they differ in their genetic make-up. Differences in genetic make-up are due to differences in the nucleotide sequences of their DNA. Advances in genetic engineering techniques made it possible to isolate and clone pieces of DNA and determine the nucleotide sequences of these fragments. Therefore, in 1990, the U.S. Department of Energy and the National Institutes of Health embarked on the project of sequencing the human genome, called the Human Genome Project (HGP). The Wellcome Trust (UK) joined the project as a major partner. Later, Japan, France, Germany, China and some other countries also joined it.

The magnitude of the project can be imagined as follows: if the cost of sequencing one base pair (bp) is 3 US dollars, sequencing 3 × 10⁹ bp would cost 9 billion US dollars. If the data were stored in books, with each book having 1000 pages and each page 1000 letters, some 3000 to 3300 books would be required (the higher figure corresponds to a genome of about 3.3 × 10⁹ bp). Here, bioinformatics databases and other high-speed computational tools have helped in the analysis, storage and retrieval of information.
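The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the 3.3 × 10⁹ bp value is an assumption used to show where a figure of 3300 books would come from.

```python
# Back-of-the-envelope figures for the scale of the project.
GENOME_BP = 3 * 10**9        # genome size used in the cost estimate
COST_PER_BP_USD = 3          # illustrative cost per base pair
LETTERS_PER_PAGE = 1000
PAGES_PER_BOOK = 1000


def sequencing_cost(genome_bp, cost_per_bp):
    """Total cost in US dollars for sequencing every base pair once."""
    return genome_bp * cost_per_bp


def books_needed(genome_bp, letters_per_page=LETTERS_PER_PAGE,
                 pages_per_book=PAGES_PER_BOOK):
    """Number of books needed if each letter stands for one base pair."""
    letters_per_book = letters_per_page * pages_per_book
    # Ceiling division, so a partly filled book still counts as a book.
    return -(-genome_bp // letters_per_book)


print(sequencing_cost(GENOME_BP, COST_PER_BP_USD))  # 9000000000
print(books_needed(GENOME_BP))                      # 3000
print(books_needed(33 * 10**8))                     # 3300 (assumed 3.3 billion bp)
```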

Goals: HGP set the following goals:

1): Determine the sequence of the 3 billion base pairs in the human genome.

2): Identify all the approximately 20,000 to 25,000 genes present in the human genome.

3): Determine the functions of all the genes.

4): Identify the various genes that cause genetic disorders.

5): Determine genetic predisposition and resistance to various disorders.

6): Store the information in databases.

7): Improve tools for data analysis.

8): Explore the possibilities of transferring technology developed during HGP to industry.

9): Address the many ethical, legal and social issues (ELSI) that may arise from the project.

The project was slated to be completed in 2003. On February 12, 2001, a formal announcement about the completion of the working draft of the sequence was made. However, the announcement that all individual chromosomes had been sequenced came in May 2006, when nucleotide sequences were assigned to chromosome 1.

The discussions among scientists in the 1980s led to the launch of the Human Genome Project in 1990. The initial goals of the Human Genome Project were:

1) To map all the human genes,

2) To construct a detailed physical map of the entire human genome, and

3) To determine the nucleotide sequence of all 24 human chromosomes by the year 2005.

General Features of the Human Genome:

1) The entire human genome contains about 3.2 billion base pairs of DNA.

2) The base-pair composition of the DNA varies across regions of the human genome.

(i) On average, about 41 percent of the DNA consists of G:C base pairs.

However, some regions are G:C rich and others are G:C poor.
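To make the idea of G:C-rich and G:C-poor regions concrete, here is a minimal Python sketch that computes the G:C fraction of a sequence and of fixed-size windows along it. The function names and the toy sequence are invented for illustration and are not part of any genome analysis package.

```python
def gc_fraction(seq):
    """Fraction of A/C/G/T bases in `seq` that are G or C (0.0 if none)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


def windowed_gc(seq, window):
    """G:C fraction of consecutive, non-overlapping windows of a given size."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


toy = "ATATATATGCGCGCGC"       # made-up sequence: A:T-rich half, G:C-rich half
print(gc_fraction(toy))        # 0.5
print(windowed_gc(toy, 8))     # [0.0, 1.0]
```

The overall value hides the regional variation: the toy sequence is 50 percent G:C on average, yet its two halves are entirely G:C poor and entirely G:C rich.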

The Human Genome Project, which operated from 1990 to 2003, provided researchers with basic information about the genetic content of the human organism, opening new avenues of discovery in fields such as cancer research. 

Salient Features of the Human Genome:

1): The human genome has 3.1647 billion (or 3164.7 million) nucleotide base pairs.

2): The average gene size is 3000 base pairs. The largest gene is the one associated with Duchenne muscular dystrophy (the dystrophin gene) on the X chromosome. It has 2.4 million (2400 kilo) base pairs. The β-globin and insulin genes are less than 10 kilobases.

3): The human genome consists of about 30,000 genes. Previously it was estimated to contain 80,000 to 140,000 genes. The human gene count is around the same as that of the mouse. Nine-tenths of the genes are identical to those of the mouse. We have more than twice as many genes as the fruit fly (Drosophila melanogaster) and six times more genes than the bacterium Escherichia coli. The lily has 18 times more DNA than a human.

4): Chromosome 1 has 2968 genes while the Y chromosome has 231 genes. These are the highest and lowest gene counts among the human chromosomes.

5): The function of over 50% of the discovered genes is unknown.

6): Less than 2% of the genome represents structural genes that code for proteins.

7): 99.9% of the nucleotide bases are exactly the same in all human beings. Only 0.1% of the human genome accounts for the variability observed among human beings.

8): Single nucleotide differences, called SNPs (snips) or single nucleotide polymorphisms, occur at about 1.4 million locations. They have the potential to help find chromosomal locations of disease-associated sequences and to trace human history.

9): Repeated sequences make up a large portion of the human genome.

10): Repetitive sequences are nucleotide sequences that are repeated many times, sometimes hundreds to thousands of times. They have no direct coding function but provide information about chromosome structure, dynamics and evolution.

11): Approximately 1 million copies of short 5 to 8 base pair repeated sequences are clustered around centromeres and near the ends of chromosomes. They are regarded as junk DNA.
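The single nucleotide differences mentioned in the list above can be illustrated with a small Python sketch that compares two already aligned sequences position by position. The two sequences are made up for the example, and real SNP discovery works on aligned sequencing reads from many individuals, which this sketch does not attempt.

```python
def single_nucleotide_differences(seq_a, seq_b):
    """List (position, base_a, base_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def percent_identity(seq_a, seq_b):
    """Percentage of positions at which the two aligned sequences agree."""
    differences = single_nucleotide_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - len(differences)) / len(seq_a)


person_1 = "ATGCCGTAAGCT"   # hypothetical stretch of DNA from one individual
person_2 = "ATGCCATAAGCT"   # the same stretch from another individual

print(single_nucleotide_differences(person_1, person_2))  # [(5, 'G', 'A')]
```

Positions are counted from zero, so the single difference here is at the sixth base.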

Genome India Project:

Taking inspiration from the Human Genome Project, the Department of Biotechnology (DBT) initiated the ambitious "Genome India Project" (GIP) on 3rd January 2020.

i) The GIP aims to collect 10,000 genetic samples from citizens across India to build a reference genome.

ii) Whole-genome sequencing and subsequent analysis of the genetic data of these 10,000 individuals would be carried out.

iii) This would aid our understanding of the nature of diseases affecting the Indian population, and ultimately support the development of predictive diagnostic markers.

iv) It would also open new vistas for advancing next-generation personalized medicine in the country, paving the way for predicting health and disease outcomes.

v) The initiative would also support the development of targeted preventive care, as it has the potential to help identify population groups that are more susceptible to various risk factors for certain diseases.

■ For instance, if a region shows a tendency towards a specific disease, customized interventions can be made in that region, leading to more effective treatment overall.

vi) This project is led by the Centre for Brain Research at the Bengaluru-based Indian Institute of Science, which acts as the central coordinator of a collaboration of 20 leading institutions, each collecting samples and conducting its own research.

vii) This initiative reflects India's progress in gene therapies and precision medicine, and its movement towards emerging next-generation medicine, which offers possibilities for greater customization, safety, and earlier detection.

viii) This initiative would help lay the foundation of personalized healthcare for a very large group of people on the planet.

Human Microbiome Initiative of Select Endogamous Populations of India:

Health and disease outcomes are determined by interactions between the genome and the environment. An important component of the "environment" in the context of human health is the human microbiota.

● The "Human Microbiome Initiative of Select Endogamous Populations of India" aims at comprehensive characterization of human-associated microbes in carefully selected endogamous population groups with diverse dietary habits, including key tribal populations that are not much influenced by modern lifestyles.

● The study is investigating the influence of diet, lifestyle, geography, and age on the gut microbiome, using targeted metagenomic and whole-metagenome approaches to find the association between microbial enterotype and three distinct Ayurvedic Prakriti types.

Earth BioGenome Project:

The Earth BioGenome Project aims to sequence and analyse genomes and to build a new foundation for biology, in order to drive solutions for preserving biodiversity and sustaining human societies.

■ The Earth BioGenome Project (EBP) is a worldwide group of scientists who plan to sequence, classify and characterize the genomes of life on Earth over the course of ten years.

☆ It is a global catalogue of life on the planet.

☆ In three phases, it aims to sequence 1.5 million species.

The EBP will assist in the creation of precise genetic sequences as well as the discovery of evolutionary relationships between the species, orders, and families that will make up the Digital Library of Life.

Genome Mapping 

The technique of locating genes on chromosomes is known as gene mapping. Today, sequencing a genome and employing computer algorithms to evaluate the sequence and find gene locations is the most effective method for mapping genes.

● There are two general types of genome mapping, called genetic mapping and physical mapping.

● Both types of genome mapping guide scientists towards the location of a gene (or section of DNA) on a chromosome; however, they rely on very different information.

■ Genetic mapping looks at how genetic information is shuffled between chromosomes, or between different regions of the same chromosome, during meiosis (a type of cell division), in a process called recombination or 'crossing over'.

■ Physical mapping looks at the physical distance between known DNA sequences (including genes) by working out the number of base pairs (A-T, C-G) between them.

● Significance: Genome mapping can offer firm evidence that a disease transmitted from parent to child is linked to one or more genes.

☆ Mapping also provides clues about which chromosome contains the gene and precisely where the gene lies on that chromosome.
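The difference between the two kinds of map can be sketched in Python. A genetic map distance is estimated from how often recombination separates two markers, while a physical distance is simply a count of base pairs. The text above does not state a unit for genetic distance; the sketch assumes the usual convention that 1 percent recombination corresponds to roughly 1 map unit (centimorgan) for closely linked markers. All names and numbers are illustrative.

```python
def recombination_percent(recombinant_offspring, total_offspring):
    """Percentage of offspring in which two markers were separated by crossing over."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def physical_distance_bp(position_a, position_b):
    """Number of base pairs between two positions on the same chromosome."""
    return abs(position_a - position_b)


# Genetic mapping: 120 recombinant offspring out of 1000 scored (made-up data).
# Under the assumed convention this is about 12 map units between the markers.
print(recombination_percent(120, 1000))            # 12.0

# Physical mapping: two markers at made-up base-pair coordinates.
print(physical_distance_bp(1_250_000, 3_750_000))  # 2500000
```

The two distances need not be proportional, because recombination does not occur at the same rate everywhere along a chromosome.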


There are two approaches for analysing the genome.

(i) Identifying all the genes that are expressed as RNA, referred to as expressed sequence tags (ESTs).

(ii) Sequencing the whole genome (both coding and noncoding regions) and later assigning functions to the different regions, referred to as sequence annotation. HGP followed the second methodology.

   DNA is isolated and broken randomly into fragments, because DNA is a very long polymer and there are technical limitations in sequencing very long pieces of DNA. The fragments are inserted into specialized vectors such as BACs (bacterial artificial chromosomes) and YACs (yeast artificial chromosomes) and cloned for amplification in suitable hosts such as bacteria and yeast. PCR (polymerase chain reaction) can also be used for amplification. The fragments are sequenced using automated DNA sequencers (an offshoot of the methodology developed by Frederick Sanger, a two-time Nobel laureate who is also credited with developing a method for determining amino acid sequences in proteins). The sequences were then arranged on the basis of overlapping regions, which necessitated the generation of overlapping fragments for sequencing.

   Computer-based programs were used to align the sequences. The sequences were then annotated and assigned to each chromosome. Chromosome 1 was the last to be sequenced, in May 2006. With the help of polymorphisms in microsatellites and restriction endonuclease recognition sites, genetic and physical maps of the genome have also been prepared.
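The idea of arranging fragments by their overlapping regions can be shown with a toy greedy assembler in Python. This is only a sketch of the principle, not the software used by HGP: it assumes error-free fragments from a single strand, ignores repeats, and uses made-up function names and a made-up minimum overlap of 3 bases.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def greedy_assemble(fragments, min_overlap=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap_length(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


# Three overlapping fragments of a made-up 14-base sequence, in shuffled order.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Without overlapping fragments there would be nothing to join the pieces by, which is why the fragments had to be generated so that they overlap.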

Applications and Future Challenges:

1): Disorders: More than 1200 genes are responsible for common human diseases.

2): Cancers: Efforts are in progress to identify genes that could revert cancerous cells to normal.

3): Interactions: It will be possible to study how various genes and proteins work together in an interconnected network.

4): Study of Tissues: All the genes or transcripts in a particular tissue, organ or tumour can be analysed to understand the cause of the effects produced in it.

5): Nonhuman Organisms: Information about the natural capabilities of nonhuman organisms can be used in meeting challenges in health care, agriculture, energy production and environmental remediation. For this, a number of model organisms have been sequenced, e.g., bacteria, yeast, Caenorhabditis elegans (a free-living nonpathogenic nematode), Drosophila (fruit fly), rice, Arabidopsis, etc.
