Using the R language, do the following:

• Create a data frame called Annotation with columns of gene names ("Gene_1", "Gene_2", "Gene_3", "Gene_4", "Gene_5"), Ensembl gene names ("Ens001", "Ens003", "Ens006", "Ens007", "Ens010"), pathway information ("Glycolysis", "TGFb", "Glycolysis", "TGFb", "Glycolysis") and gene lengths (100, 3000, 200, 1000, 1200).

• Create a data frame called Sample1 with Ensembl gene names ("Ens001", "Ens003", "Ens006", "Ens010") and expression (1000, 3000, 10000, 5000).

• Create a data frame called Sample2 with Ensembl gene names ("Ens001", "Ens003", "Ens006", "Ens007", "Ens010") and expression (1500, 1500, 17000, 500, 10000).

• Create a data frame containing only those genes common to all three data frames, with all the information from Annotation and the expression from Sample1 and Sample2.

    ##   ensembl geneNames    pathway geneLengths expression.x expression.y
    ## 1  Ens001    Gene_1 Glycolysis         100         1000         1500
    ## 2  Ens003    Gene_2       TGFb        3000         3000         1500
    ## 3  Ens006    Gene_3 Glycolysis         200        10000        17000
    ## 4  Ens010    Gene_5 Glycolysis        1200         5000        10000

• Add two extra columns containing the length-normalised expression (expression divided by gene length) for Sample1 and Sample2.

    ##   ensembl geneNames    pathway geneLengths expression.x expression.y Sample1_lne Sample2_lne
    ## 1  Ens001    Gene_1 Glycolysis         100         1000         1500   10.000000   15.000000
    ## 2  Ens003    Gene_2       TGFb        3000         3000         1500    1.000000    0.500000
    ## 3  Ens006    Gene_3 Glycolysis         200        10000        17000   50.000000   85.000000
    ## 4  Ens010    Gene_5 Glycolysis        1200         5000        10000    4.166667    8.333333

• Identify the total length of the genes in the Glycolysis pathway.

    ## [1] 1500

• Identify the mean length-normalised expression across Sample1 and Sample2 for the gene Ens006.

    ## [1] 67.5

• For all genes, identify the log2 fold change in length-normalised expression from Sample1 to Sample2.

    ##     Gene_1     Gene_2     Gene_3     Gene_5
    ##  0.5849625 -1.0000000  0.7655347  1.0000000
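The tasks above can be sketched in base R as follows. This is one possible approach, not the only one. The column names (`geneNames`, `ensembl`, `pathway`, `geneLengths`, `expression`, `Sample1_lne`, `Sample2_lne`) are taken from the expected output; the intermediate object names (`AnnoSample1`, `AnnoSample12`, `glycolysisLength`, `meanEns006`, `log2FC`) are assumptions introduced here for illustration. The sketch relies on `merge()` performing an inner join by default, so genes missing from any table (here Ens007, which is absent from Sample1) are dropped.

```r
# Annotation table: one row per gene
Annotation <- data.frame(
  geneNames   = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", "Gene_5"),
  ensembl     = c("Ens001", "Ens003", "Ens006", "Ens007", "Ens010"),
  pathway     = c("Glycolysis", "TGFb", "Glycolysis", "TGFb", "Glycolysis"),
  geneLengths = c(100, 3000, 200, 1000, 1200),
  stringsAsFactors = FALSE
)

# Expression tables (Sample1 has no entry for Ens007)
Sample1 <- data.frame(
  ensembl    = c("Ens001", "Ens003", "Ens006", "Ens010"),
  expression = c(1000, 3000, 10000, 5000),
  stringsAsFactors = FALSE
)
Sample2 <- data.frame(
  ensembl    = c("Ens001", "Ens003", "Ens006", "Ens007", "Ens010"),
  expression = c(1500, 1500, 17000, 500, 10000),
  stringsAsFactors = FALSE
)

# Inner joins on the shared "ensembl" column. In the second merge both
# inputs have an "expression" column, so merge() adds the .x / .y suffixes.
AnnoSample1  <- merge(Annotation, Sample1, by = "ensembl")
AnnoSample12 <- merge(AnnoSample1, Sample2, by = "ensembl")

# Length-normalised expression: expression divided by gene length
AnnoSample12$Sample1_lne <- AnnoSample12$expression.x / AnnoSample12$geneLengths
AnnoSample12$Sample2_lne <- AnnoSample12$expression.y / AnnoSample12$geneLengths

# Total length of Glycolysis genes (assumption: computed on the merged table;
# the Annotation table gives the same total because Gene_4 is a TGFb gene)
glycolysisLength <- sum(
  AnnoSample12$geneLengths[AnnoSample12$pathway == "Glycolysis"]
)

# Mean length-normalised expression across both samples for Ens006
isEns006   <- AnnoSample12$ensembl == "Ens006"
meanEns006 <- mean(c(AnnoSample12$Sample1_lne[isEns006],
                     AnnoSample12$Sample2_lne[isEns006]))

# log2 fold change from Sample1 to Sample2, labelled by gene name
log2FC <- log2(AnnoSample12$Sample2_lne / AnnoSample12$Sample1_lne)
names(log2FC) <- AnnoSample12$geneNames

print(AnnoSample12)
print(glycolysisLength)  # 1500
print(meanEns006)        # 67.5
print(log2FC)
```

Chaining two `merge()` calls keeps the example in base R. If the intent were to retain genes missing from a sample, the `all`, `all.x` or `all.y` arguments of `merge()` would be needed instead of the default inner join.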

