Introduction to R Programming PYQ 2022 Solution

Last updated on May 16, 2023

Question 1 (a): What value will be stored in the variable “X”?
X <- vector("complex", 3)

X will be a complex vector of length 3. vector() fills a new atomic vector with the zero value of its mode, so every element is 0+0i:

[1] 0+0i 0+0i 0+0i
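
The result can be confirmed by running the call in a fresh session. A minimal sketch using only base R:

```r
# vector(mode, length) creates a vector filled with the "zero" of that mode
X <- vector("complex", 3)

print(X)       # [1] 0+0i 0+0i 0+0i
class(X)       # "complex"
length(X)      # 3
```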

Question 1 (b): Write an R statement to extract the rows from a data frame “df” that do not have missing values.

The R statement would be:

df_complete <- df[complete.cases(df), ]

The complete.cases() function returns a logical vector with one element per row: TRUE if the row has no missing values, FALSE otherwise. The [ operator then subsets the data frame, keeping only the rows for which complete.cases() returned TRUE.
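
As an illustration, here is a small sketch on a made-up data frame (the column names and values are assumptions, not part of the question). It also shows na.omit(), which keeps the same rows:

```r
# Hypothetical data frame with missing values in rows 2 and 3
df <- data.frame(id    = 1:4,
                 score = c(90, NA, 75, 60),
                 grade = c("A", "B", NA, "C"))

keep <- complete.cases(df)     # TRUE FALSE FALSE TRUE
df_complete <- df[keep, ]      # rows 1 and 4 remain

# na.omit() drops the same incomplete rows
same_rows <- na.omit(df)
```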

Question 1 (c): Write the output for statements 1 and 2 in the following R script.

y <- c(2, 1, 5, 7, 8, 3, 2, 4, 5)
length(y) <- 4
print(y) #statement 1
length(y) <- 6
print(y) #statement 2

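Output of Statement 1:

[1] 2 1 5 7

Output of Statement 2:

[1]  2  1  5  7 NA NA

Shrinking the length to 4 discards the last five elements for good, so growing it back to 6 pads the vector with NA instead of restoring 8 and 3.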

Question 1 (d): For the given factor f <- factor(c("abc", "abc", "cab", "bac", "abc", "cab", "cab")), what will table(f) return?

Output:

f
abc bac cab 
  3   1   3

Question 1 (e): What are the two compulsory files in a package directory structure?

The two compulsory files in a package directory structure are:

DESCRIPTION: This file contains information about the package such as its name, version, author, description, and dependencies.
NAMESPACE: This file contains information about the package’s namespace, which determines which functions, variables, and other objects are visible to the user when the package is loaded.

Question 1 (f): What is the difference between the functions “read.csv” and “read.csv2”?

Both read.csv and read.csv2 read delimited text files into a data frame and are thin wrappers around read.table. They differ only in the default values of the arguments sep and dec.
read.csv assumes that the separator in the input file is a comma (,) and the decimal point is a period (.).
read.csv2 assumes that the separator in the input file is a semicolon (;) and the decimal point is a comma (,).
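
The difference is easiest to see by parsing the same records written in both conventions. The sketch below uses made-up inline data passed through the text argument instead of a file:

```r
# The same two records in both conventions (illustrative data)
csv_text  <- "item,price\npen,1.5\nbook,12.25"
csv2_text <- "item;price\npen;1,5\nbook;12,25"

a <- read.csv(text = csv_text)      # defaults: sep = ",", dec = "."
b <- read.csv2(text = csv2_text)    # defaults: sep = ";", dec = ","

# Both calls should yield a numeric price column with the same values
str(a)
str(b)
all.equal(a, b)
```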

Question 2: Consider “Student” table in a MySQL database ‘db1’:
Student(roll_no, name, city, course)
Write an R script to perform the following tasks:
(i) Load relevant packages to connect with the database.
(ii) Establish the connection with the ‘db1’ database.
(iii) Display all tables of the database ‘db1’.
(iv) Display the total number of students from the ‘Student’ table.
(v) Close the database connection.

Solution:

# Load relevant packages
library(RMySQL)

# Establish connection with the 'db1' database
con <- dbConnect(MySQL(), user='username', password='password', dbname='db1', host='localhost')

# Display all tables of the database 'db1'
tables <- dbListTables(con)
print(tables)

# Display the total number of students from the 'Student' table
query <- "SELECT COUNT(*) FROM Student"
result <- dbGetQuery(con, query)
print(result)

# Close the database connection
dbDisconnect(con)

Question 3 (a): Write output for the following command:

switch(5 %/% 2, sum(2:8), summary(c('a', 'b')),
sample(10, 5))

The output of the given command is the summary of the character vector c('a', 'b'):

   Length     Class      Mode 
        2 character character 

Explanation:

When the first argument of switch() is numeric, it is coerced to an integer and used as a position: the alternative at that position is evaluated and returned, and the other alternatives are never evaluated. Here the selector is 5 %/% 2, the integer division of 5 by 2, which is 2.

Let’s break down the alternatives of the switch() statement:

sum(2:8): This would return 2 + 3 + 4 + 5 + 6 + 7 + 8 = 35, but it is the first alternative and the selector is 2, so it is not evaluated.
summary(c('a', 'b')): This is the second alternative, so it is selected. For a character vector, summary() reports the length, class, and mode rather than a count of each value.
sample(10, 5): This would draw 5 numbers without replacement from 1 to 10, but it is the third alternative, so it is not evaluated and no random numbers are produced.

Overall, switch() returns the result of summary(c('a', 'b')), as shown above.
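
A quick way to check which alternative a numeric selector picks is to store the selector and the result separately. A minimal sketch (the variable names idx and res are illustrative):

```r
idx <- 5 %/% 2      # integer division, gives 2
res <- switch(idx, sum(2:8), summary(c("a", "b")), sample(10, 5))

print(idx)          # [1] 2
print(res)          # summary of the character vector: Length, Class, Mode
names(res)          # "Length" "Class" "Mode"
```

Because the selector is 2, only the second alternative runs, so repeated calls give the same result and no random sample is drawn.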

Question 3 (b): Given a list L as:

L <- list(
a=2,
b=3,
twin= c(2, 2),
trip= c(2, 2, 2)
)

what will be the output of the following R statements?
(i) unlist(L)
(ii) lapply(L, length)
(iii) sapply(L, length)

Output:

# (i)
    a     b twin1 twin2 trip1 trip2 trip3 
    2     3     2     2     2     2     2 

# (ii)
$a
[1] 1

$b
[1] 1

$twin
[1] 2

$trip
[1] 3

# (iii)
   a    b twin trip 
   1    1    2    3 

Question 4: Consider the following data frame ‘df’.

SNo	Value	Class
1	98	A
2	21	B
3	67	C
4	23	A
5	11	A
6	12	C
7	34	C
8	56	B
9	78	A
10	90	C
11	12	C

Write an R script to perform the following:
(i) Display the rows of “df” where Class is “A”
(ii) Display the total values for each class.
(iii) Create a suitable plot to show the statistical summary of all values with respect to their class.

# Load the dataset into a data frame
df <- data.frame(SNo = 1:11, Value = c(98, 21, 67, 23, 11, 12, 34, 56, 78, 90, 12), 
                 Class = c("A", "B", "C", "A", "A", "C", "C", "B", "A", "C", "C"))

# (i) Display the rows of “df” where Class is “A”
subset(df, Class == "A")

# (ii) Display the total values for each class.
aggregate(df$Value, by=list(df$Class), FUN=sum)

# (iii) Create a suitable plot to show the statistical summary of all values with respect to their class.
boxplot(df$Value ~ df$Class, col="lightblue", main="Boxplot of Values by Class")

For better understanding, the output of the above code is:

# Output of (i)
  SNo Value Class
1   1    98     A
4   4    23     A
5   5    11     A
9   9    78     A

# Output of (ii)
  Group.1   x
1       A 210
2       B  77
3       C 215
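
The per-class totals can be cross-checked with tapply(), and the formula interface of aggregate() gives the same sums with more readable column names. A short sketch that rebuilds the data frame from the question:

```r
df <- data.frame(SNo = 1:11,
                 Value = c(98, 21, 67, 23, 11, 12, 34, 56, 78, 90, 12),
                 Class = c("A", "B", "C", "A", "A", "C", "C", "B", "A", "C", "C"))

# One sum per class, returned as a named array
totals <- tapply(df$Value, df$Class, sum)
print(totals)       # A = 210, B = 77, C = 215

# Formula interface: result columns are named Class and Value
by_class <- aggregate(Value ~ Class, data = df, FUN = sum)
print(by_class)
```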

Question 5 (a): Given a data frame “rect” containing the length and height of five rectangles and a function “rect_area” to compute the area of rectangles as:

rect <- data.frame(L = c(10, 5.5, 6, 7.8, 9.7),
                   B = c(6, 4, 1.2, 3, 4))
rect_area <- function(a, b)
{
 a*b
}

Write an R statement to create a package called “my_area” to compute the area of rectangles using given data frame and function.

For the R/rect_area.R file:

rect_area <- function(a, b) {
  a * b
}

For the man/rect_area.Rd file:

\name{rect_area}
\alias{rect_area}
\title{Compute the area of a rectangle}
\description{
  This function computes the area of a rectangle given its length and height.
}
\usage{
  rect_area(a, b)
}
\arguments{
  \item{a}{length of rectangle}
  \item{b}{height of rectangle}
}
\value{
  The area of the rectangle.
}
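
The package directory itself can be generated with base R's package.skeleton(), which writes DESCRIPTION, NAMESPACE, R/, data/ and man/ from a set of objects. The sketch below is one reading of what the question intends, not the only possible answer. The name "myarea", the environment pkg_env and the temporary path are assumptions made for illustration: Writing R Extensions restricts package names to letters, digits and dots, so the question's "my_area" would not be accepted when the package is built or checked.

```r
# Collect the objects that should end up inside the package
pkg_env <- new.env()
pkg_env$rect <- data.frame(L = c(10, 5.5, 6, 7.8, 9.7),
                           B = c(6, 4, 1.2, 3, 4))
pkg_env$rect_area <- function(a, b) {
  a * b
}

# Assumed package name and a throwaway parent directory
pkg_parent <- tempfile("pkgs")
dir.create(pkg_parent)

package.skeleton(name = "myarea",
                 list = c("rect", "rect_area"),
                 environment = pkg_env,
                 path = pkg_parent)

# Inspect the generated skeleton
list.files(file.path(pkg_parent, "myarea"), recursive = TRUE)
```

When rect and rect_area already live in the global environment, the environment and path arguments can be dropped, which leaves a single statement of the form package.skeleton(name = ..., list = c("rect", "rect_area")).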

Question 5 (b): For the given matrices ‘x’ and ‘y’,

x <- matrix(rep(1:3, each = 2), nrow = 3, ncol = 2)
y <- matrix(rep(1:3, length.out = 6), nrow = 2, ncol = 3)

What will be the output of:
(i) x %*% y
(ii) x * t(y)

Output:

# matrix() fills column by column, so
# x has rows (1, 2), (1, 3), (2, 3) and y has rows (1, 3, 2), (2, 1, 3)

# Output for x %*% y
     [,1] [,2] [,3]
[1,]    5    5    8
[2,]    7    6   11
[3,]    8    9   13

# Output for x * t(y)
     [,1] [,2]
[1,]    1    4
[2,]    3    3
[3,]    4    9
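
Both results can be verified by building the matrices and printing the intermediate values. A minimal sketch (prod_matrix and prod_elem are illustrative names):

```r
x <- matrix(rep(1:3, each = 2), nrow = 3, ncol = 2)
y <- matrix(rep(1:3, length.out = 6), nrow = 2, ncol = 3)

print(x)    # filled by column: columns are (1, 1, 2) and (2, 3, 3)
print(y)    # filled by column: columns are (1, 2), (3, 1), (2, 3)

prod_matrix <- x %*% y     # matrix product: (3 x 2) %*% (2 x 3) is 3 x 3
prod_elem   <- x * t(y)    # element-wise product of two 3 x 2 matrices

print(prod_matrix)
print(prod_elem)
```

The key point is that matrix() fills values column by column unless byrow = TRUE is given, which is why x is not simply three identical rows.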

Question 6: Consider the following dataset that shows the number of times each task is performed by either P1, P2 or jointly by P1 and P2:

Task/Person	P1	P2	Jointly
Laundry	56	34	4
Meal	24	10	4
Cleaning	53	23	20
Dishes	32	56	40
Finances	13	23	70
Driving	10	78	0
Holidays	0	4	0

Write R script to:
(i) Find the tasks which are performed more by the P1 than the P2.
(ii) Display the tasks that are jointly performed by P1 and P2.
(iii) Give a suitable plot to show the frequency of each task performed by P1 and P2. Give appropriate labels and legends.

# create the dataset
task_person <- data.frame(Task_Person = c("Laundry", "Meal", "Cleaning", "Dishes", "Finances", "Driving", "Holidays"),
                           P1 = c(56, 24, 53, 32, 13, 10, 0),
                           P2 = c(34, 10, 23, 56, 23, 78, 4),
                           Jointly = c(4, 4, 20, 40, 70, 0, 0))

# display the tasks which are performed more by the P1 than the P2
task_person[task_person$P1 > task_person$P2, "Task_Person"]

# display the tasks that are jointly performed by P1 and P2
task_person[task_person$Jointly > 0, "Task_Person"]

# create a plot to show the frequency of each task performed by P1 and P2
library(ggplot2)
library(reshape2)

# reshape the data into long format, keeping only the P1 and P2 columns
task_person_long <- melt(task_person, id.vars = "Task_Person", measure.vars = c("P1", "P2"))

# create the plot
ggplot(data = task_person_long, aes(x = Task_Person, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "Task", y = "Frequency", fill = "Person") +
  ggtitle("Frequency of Tasks Performed by P1 and P2") +
  theme(plot.title = element_text(hjust = 0.5))

Question 7 (a): Write an R script to read a file “my_file.txt”:
(i) Headers as in input file,
(ii) Separator as newline character,
(iii) Indicate blank rows as missing values,
(iv) Quoting strings as ‘ ‘.

# Read file with headers, newline separator, empty strings as missing values, and single quotes as the quoting character
my_data <- read.table("my_file.txt", header = TRUE, sep = "\n", na.strings = "", quote = "'")

Question 7 (b): What will be the output of “f(5)”? Function “f” is defined as follows:

f <- function(x)
{
 f <- function(x)
 {
  print(x^2)
 }
f(x) +1
}

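Calling f(5) prints two lines:

[1] 25
[1] 26

Inside the outer function, the name f is rebound to the inner function, so f(x) calls the inner f. It prints 25 and, because print() returns its argument invisibly, evaluates to 25. Adding 1 gives 26, which is the return value of f(5) and is auto-printed at the top level.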
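
To separate what the function prints from what it returns, the printed text can be captured with capture.output(). A small sketch (printed and result are illustrative names):

```r
f <- function(x) {
  # The inner definition masks the outer f only inside this body
  f <- function(x) {
    print(x^2)      # prints x^2 and returns it invisibly
  }
  f(x) + 1
}

# Capture the side effect and keep the return value
printed <- capture.output(result <- f(5))

printed             # "[1] 25"  (printed by the inner function)
result              # 26        (value returned by the outer function)
```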
