DATA5100: Data Mining: R Programming
Short Project 1
Scenario
The only data researcher/chemical analyst at the Blane Research Company has resigned. Before resigning, this person was analyzing the wine dataset to develop research recommendations for local growers. The dataset contains the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars. It records 14 attributes for each wine: the class (cultivar) and the quantities of 13 chemical constituents.
Today is your first day as a data researcher/chemical analyst at the Blane Research Company. To get you up to speed, your supervisor has directed you to take the wine.csv file and analyze it using R.
Assignment Instructions
Download the wine.csv file from ulearn. Follow the steps below, take a screenshot of the output of each step, and place the screenshots in a Word document with a full description of each one. When you have completed the assignment, name your file ShortProject1.doc and submit it via the Short Project 1 submission link.
The attributes are as follows:
1) Class
2) Alcohol
3) Malic acid
4) Ash
5) Alcalinity of ash
6) Magnesium
7) Phenols
8) Flavanoids
9) Nonflavanoid phenols
10) Proanthocyanins
11) Color intensity
12) Hue
13) OD280/OD315 of diluted wines
14) Proline
1) Load data into R
1. Open RStudio and set the directory where you have saved wine.csv as the working directory.
2. Load wine.csv into an R object named data.
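The two steps above can be sketched as follows. This is a minimal illustration, not the required solution: it assumes wine.csv has a header row, and because the real file is not reproduced here it first writes a tiny stand-in wine.csv with made-up values and hypothetical column names to a temporary folder. With the real download you would skip that part and point setwd() at your own folder.

```r
# Stand-in for the real download: a tiny wine.csv with MADE-UP values and
# assumed column names, written to a temporary folder so the sketch runs anywhere.
wine_dir <- tempdir()
sample_rows <- data.frame(
  Class   = c(1, 1, 2, 3),
  Alcohol = c(14.2, 13.2, 12.4, 13.7),
  Hue     = c(1.04, 1.05, 0.98, 0.60)
)
write.csv(sample_rows, file.path(wine_dir, "wine.csv"), row.names = FALSE)

# Step 1: make the folder that holds wine.csv the working directory.
old_dir <- setwd(wine_dir)   # setwd() returns the previous directory
getwd()                      # confirm where R is now looking

# Step 2: read the file into an object named data
# (assumes the first line of the file holds the column names).
data <- read.csv("wine.csv", header = TRUE)

# Restore the previous working directory (only needed for this sketch).
setwd(old_dir)
```

In RStudio the working directory can also be set from the Session menu; setwd() is simply the scripted equivalent.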
2) Examine the data
1. Determine the dimensionality of data.
2. Determine the column names of data.
3. Determine the structure of data.
4. Determine the attributes of data.
5. List the first 5 rows of data.
6. List the first 5 columns of data.
7. List the contents of the Alcohol column in data.
8. Convert the last result into a column vector.
9. List the contents of the first 10 rows of the Alcohol column in data.
10. Convert the last result into a column vector.
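One possible way to work through the ten steps above in base R is sketched below. The data frame here is a hypothetical stand-in filled with random values, and its column names (Alcohol, MalicAcid, and so on) are assumptions; the real wine.csv may name its columns differently. Reading "column vector" as an n x 1 matrix is also an interpretation, not something the assignment states.

```r
# Hypothetical stand-in for wine.csv: RANDOM values, ASSUMED column names.
set.seed(1)
data <- data.frame(
  Class      = rep(1:3, each = 4),
  Alcohol    = round(runif(12, 11, 15), 2),
  MalicAcid  = round(runif(12, 0.7, 5.8), 2),
  Ash        = round(runif(12, 1.4, 3.2), 2),
  Phenols    = round(runif(12, 1.0, 3.9), 2),
  Flavanoids = round(runif(12, 0.3, 5.1), 2),
  Hue        = round(runif(12, 0.5, 1.7), 2)
)

dim(data)          # 1. dimensionality: number of rows, then columns
names(data)        # 2. column names
str(data)          # 3. structure: type and leading values of each column
attributes(data)   # 4. attributes: names, class, row.names
data[1:5, ]        # 5. first 5 rows (head(data, 5) is equivalent)
data[, 1:5]        # 6. first 5 columns
data$Alcohol       # 7. contents of the Alcohol column

# 8. one reading of "column vector": an n x 1 matrix
alcohol_col <- as.matrix(data$Alcohol)
alcohol_col

data$Alcohol[1:10] # 9. first 10 rows of the Alcohol column

# 10. the same conversion applied to the 10-value result
alcohol_10 <- as.matrix(data$Alcohol[1:10])
alcohol_10
```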
3) Explore Individual Variables
1. Determine the summary (five-number summary and mean) of each of the fourteen variables in the data set (Class, Alcohol, MalicAcid,…).
2. Determine the mean, median, and range of the variable Alcohol.
3. Determine the 0th, 25th, 50th, 75th, and 100th percentiles of the variable Phenols.
4. Determine the 10th, 30th, and 65th percentiles of the variable Phenols.
5. Determine the inter-quartile range (IQR) of the variable Hue.
6. Determine the frequency of each data value in the variable Class.
7. View the last result in a pie chart.
8. Determine the variance of the variable Flavanoids.
9. Determine the standard deviation of the variable Flavanoids.
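The exploration steps above map onto a handful of base R functions, sketched below under stated assumptions. The data frame is again a hypothetical stand-in with random values, and the column names Alcohol, Phenols, Hue, Class, and Flavanoids are assumed; check names(data) on the real file before reusing any of this.

```r
# Hypothetical stand-in for wine.csv: RANDOM values, ASSUMED column names.
set.seed(1)
data <- data.frame(
  Class      = rep(1:3, times = c(4, 5, 3)),
  Alcohol    = round(runif(12, 11, 15), 2),
  Phenols    = round(runif(12, 1.0, 3.9), 2),
  Flavanoids = round(runif(12, 0.3, 5.1), 2),
  Hue        = round(runif(12, 0.5, 1.7), 2)
)

summary(data)            # 1. min, quartiles, median, mean, max per column

mean(data$Alcohol)       # 2. mean
median(data$Alcohol)     #    median
range(data$Alcohol)      #    range, reported as minimum and maximum

quantile(data$Phenols)   # 3. default probs give 0%, 25%, 50%, 75%, 100%
quantile(data$Phenols, probs = c(0.10, 0.30, 0.65))   # 4. chosen percentiles

IQR(data$Hue)            # 5. inter-quartile range (75th minus 25th percentile)

class_counts <- table(data$Class)   # 6. frequency of each Class value
class_counts

pie(class_counts, main = "Wines per class")   # 7. pie chart of the frequencies

flav_var <- var(data$Flavanoids)    # 8. sample variance
flav_var
flav_sd <- sd(data$Flavanoids)      # 9. sample standard deviation
flav_sd
```

If a single number is wanted for the range in step 2, diff(range(data$Alcohol)) gives the maximum minus the minimum.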