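A data frame is R's general-purpose data object for storing tabular data. Data frames are among the most widely used data objects in R programming, because data is often easier to analyze in tabular form. A data frame can also be thought of as a matrix in which each column may have a different data type. A data frame consists of three main components: the data, the rows, and the columns.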
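The operations that can be performed on a data frame are:
- Creating a data frame
- Accessing rows and columns
- Selecting a subset of a data frame
- Editing a data frame
- Adding extra rows and columns to a data frame
- Adding new variables to a data frame based on existing variables
- Deleting rows and columns from a data frame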
To create a data frame, use the data.frame() command and pass each of the vectors you have created as arguments to the function.
Example:
# R program to illustrate a data frame

# A character vector
Name = c("Amiya", "Raj", "Asish")
# A character vector
Language = c("R", "Python", "Java")
# A numeric vector
Age = c(22, 25, 45)

# To create a data frame, use the data.frame() command
# and pass each of the vectors created above
# as arguments to the function
df = data.frame(Name, Language, Age)
print(df)
Output:
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
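To make the three components of a data frame (data, rows, and columns) more concrete, here is a minimal sketch that inspects the example data frame with base R functions. The argument stringsAsFactors = FALSE is an addition not present in the original example; it is assumed here so that character columns stay character on R versions older than 4.0.

```r
# Rebuild the example data frame.
# stringsAsFactors = FALSE (an assumption, not in the original example)
# keeps character columns as character on older R versions.
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE
)

n_rows = nrow(df)               # number of rows
n_cols = ncol(df)               # number of columns
col_names = names(df)           # column names
col_types = sapply(df, class)   # data type of each column

print(dim(df))
print(col_types)
str(df)                         # compact summary of the structure
```

Each column keeps its own type here (two character columns and one numeric column), which is the property that distinguishes a data frame from a matrix.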
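Creating a data frame from a file: A data frame can also be created by importing data from a file. To do this, use the function read.table().
Syntax:
newDF = read.table(file = "Path of the file")
To create a data frame from a CSV file in R:
Syntax:
newDF = read.csv("FileName.csv")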
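As a rough illustration of both functions, the sketch below writes a small CSV to a temporary file and reads it back. The temporary file and its contents are made up for this example; header = TRUE and sep = "," are the arguments read.table() needs in order to treat the file as comma-separated with a header row.

```r
# Write a small, made-up CSV to a temporary file
csv_path = tempfile(fileext = ".csv")
writeLines(c("Name,Language,Age",
             "Amiya,R,22",
             "Raj,Python,25"), csv_path)

# read.csv() assumes a header row and comma separators
newDF = read.csv(csv_path)

# read.table() needs the header and separator spelled out
sameDF = read.table(file = csv_path, header = TRUE, sep = ",")

print(newDF)
str(newDF)
```

For a simple file like this one, both calls should produce the same data frame.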
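Accessing rows and columns
The syntax for accessing rows and columns is:
df[val1, val2]
df = data frame object
val1 = rows of the data frame
val2 = columns of the data frame
Here val1 and val2 can also be vectors of values, such as 1:2 or 2:3. If you specify only df[val2], it refers to the set of columns you want to access from the data frame.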
Example: row selection
# R program to illustrate operations
# on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
print(df)

# Accessing the first and second rows
cat("Accessing first and second row\n")
print(df[1:2, ])
Output:
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
Accessing first and second row
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
Example: column selection
# R program to illustrate operations
# on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
print(df)

# Accessing the first and second columns
cat("Accessing first and second column\n")
print(df[, 1:2])
Output:
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
Accessing first and second column
   Name Language
1 Amiya        R
2   Raj   Python
3 Asish     Java
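The difference between df[val2] and df[, val2] is easy to miss, so here is a small sketch of the common access forms on the same example data. The variable names are chosen for illustration only, and stringsAsFactors = FALSE is an added assumption so that the behavior is the same on older R versions.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

one_col_df = df[2]                  # single index: a one-column data frame
one_col_vec = df[, 2]               # row/column form: a plain vector
by_name = df$Language               # column by name, also a plain vector
two_cols = df[, c("Name", "Age")]   # several columns selected by name
cell = df[2, "Age"]                 # a single cell: row 2, column Age

print(class(one_col_df))
print(class(one_col_vec))
print(two_cols)
print(cell)
```

Selecting by name tends to be more robust than selecting by position, since it keeps working if the column order changes.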
Selecting a subset of a data frame
A subset of a data frame can also be created based on certain conditions, using the following syntax:
newDF = subset(df, conditions)
df = original data frame
conditions = some conditions
Example:
# R program to illustrate operations
# on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
print(df)

# Selecting the subset of the data frame
# where Name is equal to Amiya
# OR Age is greater than 30
newDf = subset(df, Name == "Amiya" | Age > 30)

cat("After selecting the subset of the data frame\n")
print(newDf)
Output:
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After selecting the subset of the data frame
   Name Language Age
1 Amiya        R  22
3 Asish     Java  45
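The same selection can also be written with logical indexing, which is one way to see what subset() is doing. The sketch below compares the two forms and also uses the select argument of subset() to keep only some columns; the variable names are illustrative, and stringsAsFactors = FALSE is an added assumption.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# Condition evaluated inside the data frame by subset()
by_subset = subset(df, Name == "Amiya" | Age > 30)

# Equivalent logical indexing: columns must be referenced through df$
by_index = df[df$Name == "Amiya" | df$Age > 30, ]

# subset() can also restrict the columns via the select argument
names_over_23 = subset(df, Age > 23, select = c(Name, Age))

print(by_subset)
print(names_over_23)
```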
Editing a data frame
In R, a data frame can be edited in two ways:
Editing a data frame by direct assignment: Much like lists in R, a data frame can be edited by direct assignment.
Example:
# R program to illustrate operations on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
cat("Before editing the data frame\n")
print(df)

# Editing a data frame by direct assignment
# [[3]] accesses the top-level component,
# here the Age column
# [[3]][3] accesses the inner-level component,
# here the Age of Asish
df[[3]][3] = 30

cat("After editing the data frame\n")
print(df)
Output:
Before editing the data frame
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After editing the data frame
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  30
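Positional indexes such as [[3]][3] can be hard to read later. As an alternative sketch, the same kind of direct assignment can be written with column names and a condition. The second edit (changing Raj's age to 26) is a made-up value used only for illustration, and stringsAsFactors = FALSE is an added assumption.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# Same edit as df[[3]][3] = 30, but addressed by name and condition
df$Age[df$Name == "Asish"] = 30

# Row index plus column name (26 is an illustrative value)
df[2, "Age"] = 26

print(df)
```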
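Editing a data frame using the edit() command:
Follow these steps to edit a data frame:
Step 1: Create a data frame instance. For example, the command data.frame() below creates an empty data frame instance named myTable.
myTable = data.frame()
Step 2: Next, launch the viewer with the edit() function. Note that the result of editing myTable is assigned back to the myTable object, so that the changes made in the editor are saved to the original object.
myTable = edit(myTable)
When the above command is executed, a data editor window pops up.
Step 3: Now fill the table in the editor with the small data set (the same one printed in Step 4).
Note that a variable name is changed by clicking on it and typing the new name. Variables can also be set as numeric or character. Once the data in the editor looks as intended, close the editor. The changes are saved automatically.
Step 4: Check the resulting data frame by printing it.
> myTable
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45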
Adding rows and columns to a data frame
Adding extra rows: Extra rows can be added with the rbind() command. The syntax is:
newDF = rbind(df, the entries for the new row to add)
df = original data frame
Take care with the entries for the new row when using rbind(), because the data type of each column entry should match the data type of the rows already present.
Example:
# R program to illustrate operations on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
cat("Before adding row\n")
print(df)

# Add a new row using rbind()
newDf = rbind(df, data.frame(Name = "Sandeep",
                             Language = "C",
                             Age = 23))
cat("After adding a row\n")
print(newDf)
Output:
Before adding row
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After adding a row
     Name Language Age
1   Amiya        R  22
2     Raj   Python  25
3   Asish     Java  45
4 Sandeep        C  23
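To illustrate why the data types of the new row matter, the following sketch compares a well-typed row with one whose Age is given as text, and also shows what happens when the column names do not match. The mistyped row and the misnamed column Lang are deliberately constructed for this example; stringsAsFactors = FALSE is an added assumption.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# Matching types: Age stays numeric
ok = rbind(df, data.frame(Name = "Sandeep", Language = "C", Age = 23,
                          stringsAsFactors = FALSE))
print(class(ok$Age))

# Age supplied as text: the whole Age column is coerced to character
mixed = rbind(df, data.frame(Name = "Sandeep", Language = "C", Age = "23",
                             stringsAsFactors = FALSE))
print(class(mixed$Age))

# Mismatched column names (Lang instead of Language) raise an error
result = tryCatch(
  rbind(df, data.frame(Name = "Sandeep", Lang = "C", Age = 23)),
  error = function(e) "error"
)
print(result)
```

A silently coerced column like the one in mixed is usually harder to notice than an error, so checking the column classes after rbind() can be worthwhile.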
Adding extra columns: Extra columns can be added with the cbind() command. The syntax is:
newDF = cbind(df, the entries for the new column to add)
df = original data frame
Example:
# R program to illustrate operations on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
cat("Before adding column\n")
print(df)

# Add a new column using cbind()
newDf = cbind(df, Rank = c(3, 5, 1))

cat("After adding a column\n")
print(newDf)
Output:
Before adding column
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After adding a column
   Name Language Age Rank
1 Amiya        R  22    3
2   Raj   Python  25    5
3 Asish     Java  45    1
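A column can also be added by assigning to a new name with $, which avoids creating a separate object. The sketch below, with illustrative variable names and stringsAsFactors = FALSE as an added assumption, compares this with the cbind() form.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# cbind() returns a new data frame with the extra column
with_cbind = cbind(df, Rank = c(3, 5, 1))

# Assigning to a new column name modifies a copy in place
with_dollar = df
with_dollar$Rank = c(3, 5, 1)

print(with_cbind)
print(isTRUE(all.equal(with_cbind, with_dollar)))
```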
Adding new variables to a data frame
In R, new variables can be added to a data frame based on existing variables. To do this, first load the dplyr library with the library() command, then call the mutate() function to add an extra variable column based on existing columns.
Syntax:
library(dplyr)
newDF = mutate(df, new_var = [existing_var])
df = original data frame
new_var = name of the new variable
existing_var = the modification you want to perform (for example, multiplying a numeric value by 10)
Example:
# R program to illustrate operations on a data frame

# Importing the dplyr library
library(dplyr)

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
cat("Original Dataframe\n")
print(df)

# Creating an extra variable column "log_Age",
# which is the log of the variable column "Age",
# using the mutate() command
newDf = mutate(df, log_Age = log(Age))

cat("After creating extra variable column\n")
print(newDf)
Output:
Original Dataframe
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After creating extra variable column
   Name Language Age  log_Age
1 Amiya        R  22 3.091042
2   Raj   Python  25 3.218876
3 Asish     Java  45 3.806662
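If dplyr is not available, a similar derived column can be produced with base R. The sketch below is not the dplyr approach described above but a swapped-in alternative: it uses transform() and plain $ assignment, with illustrative variable names and stringsAsFactors = FALSE as an added assumption.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# transform() evaluates the expression using the data frame's columns
with_transform = transform(df, log_Age = log(Age))

# The same result through direct column assignment
with_dollar = df
with_dollar$log_Age = log(df$Age)

print(with_transform)
```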
Deleting rows and columns from a data frame
To delete a row or a column, first access that row or column, then put a minus sign in front of its index. The minus sign indicates that the row or column should be dropped.
Syntax:
newDF = df[-rowNo, -colNo]
df = original data frame
Example:
# R program to illustrate operations on a data frame

# Creating a data frame
df = data.frame(
  "Name" = c("Amiya", "Raj", "Asish"),
  "Language" = c("R", "Python", "Java"),
  "Age" = c(22, 25, 45)
)
cat("Before deleting the 3rd row and 2nd column\n")
print(df)

# Delete the third row and the second column
newDF = df[-3, -2]

cat("After deleting the 3rd row and 2nd column\n")
print(newDF)
Output:
Before deleting the 3rd row and 2nd column
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45
After deleting the 3rd row and 2nd column
   Name Age
1 Amiya  22
2   Raj  25
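Negative indexes only work with positions, so the sketch below shows a few hedged alternatives for dropping by name or by condition, plus one pitfall: when only a single column remains, the result collapses to a vector unless drop = FALSE is given. Variable names are illustrative, and stringsAsFactors = FALSE is an added assumption.

```r
df = data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = FALSE   # assumption: keep strings as character
)

# Drop a column by name instead of by position
no_language = df[, names(df) != "Language"]

# Assigning NULL removes the column from a copy
dropped = df
dropped$Language = NULL

# Drop rows by condition: keep only rows with Age of 30 or less
young = df[df$Age <= 30, ]

# Pitfall: leaving a single column returns a vector ...
as_vector = df[, -c(2, 3)]
# ... unless drop = FALSE keeps the data frame structure
as_frame = df[, -c(2, 3), drop = FALSE]

print(no_language)
print(young)
print(class(as_vector))
print(class(as_frame))
```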