Unit 6 Machine Learning Algorithms Class 11 AI Code 843 Book Solution

CBSE/NCERT BOOK EXERCISES

A. Multiple Choice / Objective Type Questions

1. Which of the following are the types of correlation?

a. Positive correlation
b. Negative Correlation
c. No correlation
d. All of the above

Ans: d. All of the above

2. Which of the following techniques is an analysis of the relationship between two variables to provide the prediction mechanism?

a. Standard error
b. Correlation
c. Regression
d. None of the above

Ans: c. Regression

3. Which of the given plots is suitable for testing the linear relationship between a dependent and independent variable?

a. Bar chart
b. Scatter plot
c. Histograms
d. All of the above

Ans: b. Scatter plot

4. Which of the following scatter plots represents a positive correlation?

a. points scattered randomly with no apparent trend
b. points forming a diagonal line from bottom left to top right
c. points forming a diagonal line from top left to bottom right
d. points clustered around a central point

Ans: b. points forming a diagonal line from bottom left to top right

5. Which regression technique is used when there is only one independent variable?

a. logistic regression
b. multiple linear regression
c. simple linear regression
d. polynomial regression

Ans: c. simple linear regression

6. What is one advantage of linear regression analysis?

a. it is robust to outliers
b. it can capture nonlinear relationships between variables
c. it is simple and easy to interpret
d. it is suitable for classification tasks

Ans: c. it is simple and easy to interpret

7. What is supervised learning in Artificial Intelligence?

a. training a computer algorithm on input data that is not labelled.
b. training a computer algorithm on input data that has been labelled for a specific output.
c. training a computer algorithm without any input data
d. training a computer algorithm to perform unsupervised tasks.

Ans: b. training a computer algorithm on input data that has been labelled for a specific output.

8. Which type of classification involves categorizing data into two distinct classes?

a. multi-class classification
b. binary classification
c. unsupervised classification
d. regression classification

Ans: b. binary classification

9. What is logistic regression commonly used for in binary classification?

a. categorizing observations into multiple classes
b. predicting continuous values for input data
c. categorizing observations into two distinct classes
d. identifying unstructured data patterns

Ans: c. categorizing observations into two distinct classes

10. What is the primary goal of classification in AI?

a. categorizing data into random groups
b. locating and classifying things or concepts into predefined groups
c. predicting continuous values for input data
d. identifying unstructured data patterns

Ans: b. locating and classifying things or concepts into predefined groups

11. Which algorithm is commonly used for binary classification?

a. Decision trees
b. Support Vector Machine
c. Logistic Regression
d. k-Nearest Neighbors

Ans: c. Logistic Regression

12. The K-Nearest Neighbors (KNN) algorithm assigns a class to a new data point by considering:

a. Distance from the data point to a predefined decision boundary
b. Majority vote of its K nearest neighbors in the training data
c. Similarity of the data point to a cluster centroid
d. probability of each class given the data point’s features.

Ans: b. Majority vote of its K nearest neighbors in the training data

13. What does a classification model in AI ultimately want to achieve?

a. to identify patterns and associations in data
b. to predict continuous numerical values
c. to categorize input data into predefined classes or labels
d. to optimize decision-making processes

Ans: c. to categorize input data into predefined classes or labels

14. What are some challenges in applying classification models to real-world problems?

a. Data bias and fairness
b. Interpretability and explainability
c. overfitting and underfitting
d. All of the above

Ans: d. All of the above

15. What is clustering?

a. Grouping labeled dataset
b. Dividing data into different clusters
c. Finding linear association between variables
d. Predicting future behaviors of a dependent variable

Ans: b. Dividing data into different clusters

16. Which type of learning does clustering belong to?

a. Supervised learning
b. Unsupervised learning
c. Semi-supervised learning
d. Reinforcement learning

Ans: b. Unsupervised learning

17. Which method is used to group highly dense areas into clusters?

a. Partitioning clustering
b. Density-based clustering
c. Distribution model-based clustering
d. Hierarchical clustering

Ans: b. Density-based clustering

18. Which algorithm is an example of partitioning clustering?

a. Mean-shift algorithm
b. DBSCAN algorithm
c. K-Means algorithm
d. Fuzzy clustering algorithm

Ans: c. K-Means algorithm

19. Which clustering method allows data objects to belong to more than one group or cluster?

a. Partitioning clustering
b. Density-based clustering
c. Distribution model-based clustering
d. Fuzzy clustering

Ans: d. Fuzzy clustering

20. Which clustering algorithm is sensitive to outliers?

a. K-Means algorithm
b. Mean-shift algorithm
c. DBSCAN algorithm
d. Hierarchical clustering

Ans: a. K-Means algorithm

B. Fill in the blanks

In ___________ type of ML, the models are not trained on labeled data sets.

Ans: Unsupervised Learning

The ___________________ measures the linear relationship between the independent and dependent variables.

Ans: Correlation coefficient

_________________ predicts continuous numerical values, while logistic regression predicts discrete categories.

Ans: Linear Regression

_____________ are data points on the scatterplot that do not follow the pattern of the dataset.

Ans: Outliers

_______________ algorithm operates based on the principle of proximity, making predictions by considering the similarity between data points.

Ans: K-nearest neighbors (KNN) algorithm

Clustering is a machine learning technique used to group ____________ dataset.

Ans: unlabelled

Partitioning clustering divides the data into non-hierarchical groups, also known as ____________ method.

Ans: centroid-based

Density-based clustering connects highly dense areas into clusters, separated by areas of ____________.

Ans: low point density

The primary requirement of the K-Means algorithm is that the number of clusters be ____________ beforehand.

Ans: Specified

Clustering is widely used in applications such as market segmentation and ____________.

Ans: Data Analysis

C. True/False

Clustering is a supervised learning technique.

Ans: False

Hierarchical Clustering requires pre-specifying the number of clusters.

Ans: False

Fuzzy clustering is a hard clustering method.

Ans: False

Classification is an unsupervised learning technique.

Ans: False

In k-NN algorithm, k is the number of nearest data points.

Ans: True

K-Means algorithm requires specifying the number of clusters.

Ans: True

D. Short Answer Questions

1. What is Machine learning? Name the three methods of machine learning.

Ans: Machine learning (ML) is a type of artificial intelligence (AI) focused on building computer systems that learn from data. It uses algorithms that learn from data to make predictions. The three methods of machine learning are supervised learning, unsupervised learning and reinforcement learning. In supervised learning, algorithms learn patterns from labeled data; in unsupervised learning, they discover general patterns in unlabeled data; and in reinforcement learning, they learn through reward and punishment.

2. How are correlation measures used in AI applications?

Ans: Correlation measures are used in AI applications for:

Feature selection: Identify features highly correlated with the target variable, potentially indicating relevance for prediction.
Exploratory data analysis: Understand relationships between variables and identify potential anomalies.
Recommender systems: Recommend items based on past user behaviour and correlations between items purchased together.

3. Name some examples of regression algorithms?

Ans: Examples of regression algorithms include Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Elastic Net Regression, Support Vector Regression (SVR), Decision Tree Regression, Random Forest Regression, and Gradient Boosting Regression. These algorithms are used to predict continuous numerical values and are widely applied in fields such as finance, economics and engineering.

4. What are regression algorithms used for?

Ans: Regression algorithms are used for predicting continuous numerical values based on input features. They are widely applied in various fields such as finance for stock price forecasting, economics for predicting economic indicators, healthcare for disease progression estimation, and engineering for predicting product performance. Regression analysis helps uncover relationships between variables and make informed predictions for future data points.

5. What is Linear regression? Give two applications of regression in machine learning?

Ans: Linear regression is a supervised learning algorithm. It uses an independent variable X to predict the value of a dependent variable Y. In machine learning, two applications of regression are predicting outputs and forecasting trends.

6. How can outliers impact regression analysis?

Ans: An outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement, an indication of novel data, or the result of experimental error. Outliers can significantly skew the results of regression analysis by distorting the regression line and reducing the accuracy of predictions.
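
The effect can be seen in a small sketch. The data below is made up for illustration, and `fit_line` is a hypothetical helper that applies the ordinary least-squares formulas for a straight line; it is not taken from the textbook.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of co-deviations divided by sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points that lie exactly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
clean_slope, clean_intercept = fit_line(xs, ys)

# The same points plus one made-up outlier at (6, 40).
outlier_slope, outlier_intercept = fit_line(xs + [6], ys + [40])

print(round(clean_slope, 2))    # 2.0
print(round(outlier_slope, 2))  # 6.0
```

A single unusual point triples the slope here, which is why outliers are usually inspected before a regression line is trusted.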

7. What is the primary difference between classification and regression?

Ans: Classification predicts discrete values, while regression predicts continuous values.

8. Provide examples of classification problems in real-life scenarios.

Ans: Examples of classification problems include email spam detection, handwritten character recognition, and sentiment analysis in social media posts.

9. What are some common applications of clustering techniques?

Ans: Common applications of clustering techniques include market segmentation, statistical data analysis, social network analysis, image segmentation, and anomaly detection.

10. List the types of clustering methods.

Ans: Types of clustering methods include partitioning clustering, density-based clustering, distribution model-based clustering, hierarchical clustering, and fuzzy clustering.

E. Long Answer Questions

1. How does a classification model work?

Ans:

Classes or Categories: Data is organized into different groups, such as “positive” and “negative,” representing distinct outcomes.
Features or Attributes: Each data instance is described by specific characteristics or attributes, providing information about the instance.
Training Data: The classification model learns from a dataset containing labeled examples, associating each instance with a class label.
Classification Model: An algorithm or technique is applied to the training data to build a model that can predict the class labels of new instances.
Prediction or Inference: Once trained, the model is used to classify new data instances based on the patterns learned during training.
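
The five parts above can be sketched in a few lines. This is only an illustration: it uses a nearest-centroid classifier, chosen here for brevity and not named in the textbook, and the feature values, labels, and the `train` and `predict` helpers are all hypothetical.

```python
from math import dist

def train(features, labels):
    """Classification model: store one average feature vector per class."""
    grouped = {}
    for row, label in zip(features, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(model, row):
    """Prediction: return the class whose average vector is closest."""
    return min(model, key=lambda label: dist(model[label], row))

# Training data: made-up instances with two features each,
# labeled with one of two classes.
features = [(1, 2), (2, 1), (8, 9), (9, 8)]
labels = ["negative", "negative", "positive", "positive"]

model = train(features, labels)
print(predict(model, (7, 8)))  # positive
print(predict(model, (2, 2)))  # negative
```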

2. Explain the types of clustering.

Ans:

Partitioning Clustering: This method divides data into non-hierarchical groups using a centroid-based approach, where data points are grouped into k clusters based on the proximity to cluster centroids. Eg: K-Means Clustering algorithm.
Density-Based Clustering: This technique identifies clusters by connecting highly dense areas in the dataset, allowing for arbitrarily shaped clusters to form as long as dense regions are connected. Eg: DBSCAN algorithm.
Distribution Model-Based Clustering: Here, data is clustered based on the probability of belonging to a particular distribution, often assuming Gaussian distributions. Eg: Expectation-Maximization Clustering algorithm, using Gaussian Mixture Models (GMM).
Hierarchical Clustering: This approach creates a tree-like structure, or dendrogram, to cluster data without requiring the pre-specification of the number of clusters. Eg: Agglomerative Hierarchical algorithm.

3. Write any two advantages and disadvantages of linear regression.

Ans: Advantages of Linear regression

Simple technique and easy to implement
Efficient to train

Disadvantages of Linear regression

Sensitivity to outliers, which can significantly impact the analysis.
Limited to linear relationships between variables.

4. What are the steps involved in k-NN algorithm?

Ans:

Select the number K of neighbors.
Calculate the Euclidean distance from the new data point to each point in the training data.
Take the K nearest neighbors as per the calculated Euclidean distances.
Among these K neighbors, count the number of data points in each category.
Assign the new data point to the category with the largest number of neighbors.
Our model is ready.
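
The k-NN steps can be sketched in plain Python as follows. The function name `knn_predict` and the sample points are assumptions made up for illustration.

```python
from collections import Counter
from math import dist

def knn_predict(training_points, training_labels, new_point, k):
    # Euclidean distance from the new point to every training point.
    distances = [
        (dist(point, new_point), label)
        for point, label in zip(training_points, training_labels)
    ]
    # Keep the K nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count how many of those neighbors fall in each category.
    votes = Counter(label for _, label in nearest)
    # Assign the category with the most neighbors.
    return votes.most_common(1)[0][0]

# Made-up training data: two groups of 2-D points.
training_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
training_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(training_points, training_labels, (2, 2), k=3))  # A
print(knn_predict(training_points, training_labels, (6, 5), k=3))  # B
```

An odd K is a common choice for two classes because it avoids tied votes.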

5. What are the steps involved in k-means clustering?

Ans:

Select the number K to decide the number of clusters.
Select K random points as the initial centroids. (They need not be points from the input dataset.)
Assign each data point to its closest centroid, which will form the predefined K clusters.
Calculate the variance and place a new centroid for each cluster.
Repeat the third step, which means reassigning each data point to the new closest centroid.
If any reassignment occurs, then go to step 4; else go to FINISH.
The model is ready.
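
A minimal sketch of these steps in plain Python is given below. It makes a few assumptions that the steps do not state: the starting centroids are picked from the input points using a fixed `seed`, each new centroid is placed at the mean of its cluster, and `max_iterations` caps the loop. The function name `k_means` and the sample points are made up for illustration.

```python
import random
from math import dist

def k_means(points, k, seed=0, max_iterations=100):
    rng = random.Random(seed)
    # Pick K starting centroids (here: K distinct input points).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assign each point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        # Finish when no point changed cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Place each centroid at the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments

# Made-up data: two well-separated groups of 2-D points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
centroids, assignments = k_means(points, k=2)
print(assignments)
```

Cluster numbers are arbitrary, so only the grouping matters: the first three points should share one number and the last three the other.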

F. Competency Based Questions

1. Asmita is developing an AI-driven recommendation system for a retail e-commerce platform. What type of machine learning method might she have used to:

a. Train the model with details of past purchases, user interactions, and product ratings?

b. Identify groups of similar users or products based on their browsing behavior?

Ans: a. Supervised Learning

b. Unsupervised Learning

2. Suppose you are a sales manager tasked with forecasting sales for the upcoming quarter. Describe how you would use linear regression in this scenario, including the data you would collect and the steps involved in the analysis.

Ans: In sales forecasting, linear regression can be used to predict future sales based on historical sales data, marketing spend, seasonality, and other factors. The sales manager would collect historical sales data along with relevant variables such as advertising expenditure, promotional activities, and economic indicators. By analysing this data using linear regression, the sales manager can forecast future sales trends and adjust strategies accordingly.

3. Observe the scatter plot showing the amount of sleep needed per day by age.

What type of correlation is shown here?

Ans: As age increases (moving along the x-axis toward greater numbers), the amount of sleep needed decreases (y-values decreasing). This is a negative correlation. This indicates that as individuals grow older, they generally require less sleep.
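
The direction of a correlation can also be checked numerically with the correlation coefficient. The sketch below computes the Pearson coefficient by hand. The age and sleep values are made up for illustration and are not read from the textbook's plot, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(spread_x * spread_y)

# Made-up (age in years, hours of sleep per day) values.
ages = [1, 5, 10, 20, 40, 60]
sleep_hours = [14, 11, 10, 8, 7.5, 7]

r = pearson_r(ages, sleep_hours)
print(r < 0)  # True: sleep falls as age rises
```

A coefficient near -1 indicates a strong negative correlation, a value near +1 a strong positive one, and a value near 0 no linear correlation.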

4. Ramesh is working on an assignment where he needs to categorize real-world applications of Artificial Intelligence (AI) into two groups: Classification and Clustering. While his initial attempt seems partially correct, his teacher identified a mistake.

Classification	Clustering
Medical Diagnosis	E-mail Spam Detection
Sentiment Analysis	Identifying high risk patient groups
Fraud Detection	Anomaly detection in network traffic

Identify the mistake.

Ans: E-mail Spam Detection is categorized under Clustering, but it should be under Classification.

5. Researchers are developing a new blood test to detect cancer early. The test analyzes various biomarkers (indicators) in a patient’s blood sample. The test results need to be categorized accurately. A positive result should indicate the presence of cancer cells, while a negative result should indicate no cancer. Which type of classification algorithm would be most suitable for this new cancer detection blood test?

Ans: Binary Classification