Stata Programming | Basic Commands

0. Handy commands

0.1 levelsof

levelsof displays a sorted list of the distinct values of varname.
levelsof helps us see which values a given variable takes.

*Syntax
levelsof varname [if] [in] [, options]

Suppose we want to know which values rep78 takes in auto.dta:

cls
sysuse auto, clear
levelsof rep78

Output:
1 2 3 4 5
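The list that levelsof prints can also be stored in a local macro with the local() option, which is how it is used with foreach later in this post. A minimal sketch, where the macro names rep_levels and n_levels are made up for illustration:

```stata
* Store the distinct values of rep78 in a local macro (names are illustrative)
sysuse auto, clear
levelsof rep78, local(rep_levels)

* Count the stored values with the "word count" extended macro function
local n_levels : word count `rep_levels'
display "distinct values: `rep_levels'"
display "number of distinct values: `n_levels'"
```

Missing values are left out by default, which is why rep78 shows only 1 to 5 here.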

0.2 fs
fs lists the names of files in compact form.
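fs lists the files of a given type under a given path and stores the result in the returned value r(files). Note that fs is a user-written command and has to be installed before first use.

*Syntax
fs [filespec [filespec [ ... ]]]

Suppose we want to know which dta files a folder contains:

cls
ssc install fs
cd D:\software\stata16\Stata16MP\ado\base\a
fs *.dta

Output:
auto.dta  auto2.dta  autornd.dta

return list

Output:
macros:
    r(files) : ""auto.dta" "auto2.dta" "autornd.dta" "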

0.3 rmsg
set rmsg determines whether the return message is to be displayed at the completion of each command. The initial setting is off. The return message shows how long the command took to execute and what time it completed execution.
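rmsg tells us how long our code takes to run, which helps when optimizing it.

*Syntax
set rmsg [on | off] [, permanently]

Suppose we want to know how long it takes to loop from 1 to 100000:

cls
set rmsg on
forvalues k = 1(1)100000 {
    display `k'
}

Output:
...
99997
99998
99999
100000
r; t=0.87 0:39:24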
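set rmsg reports the time after every command. When only one block needs timing, Stata's built-in timer commands are an alternative; this is a different tool from rmsg, shown here only as a hedged sketch. The timer number 1 and the macro names are arbitrary choices.

```stata
* Time one block of code with timer 1 (any number from 1 to 100 works)
timer clear 1
timer on 1
forvalues k = 1/100000 {
    local last = `k'          // cheap placeholder work, nothing is printed
}
timer off 1

* timer list shows the elapsed time and stores it in r(t1)
timer list 1
local elapsed = r(t1)
display "elapsed seconds: `elapsed'"
```

Because nothing is displayed inside the loop, the measured time reflects the loop itself rather than the cost of printing 100000 lines.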

0.4 log
log saves the session, both the commands typed and their output, to a file.

*Syntax
* Report the status of log files
log
log query [logname | _all]

* Create and open a log file
log using filename [, append replace [text|smcl] name(logname) nomsg]

* Close a log
log close [logname | _all]

* Suspend or resume logging
log off [logname]
log on [logname]

A simple example using the auto dataset:

log using log_auto, name(log1)
sysuse auto, clear
drop if rep78 == .
save test3, replace
log close log1
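Running the example above a second time stops with an error, because log_auto already exists and log1 may still be open. A common way around this, sketched below under the assumption that overwriting the old log is acceptable, is to close any leftover log first and add the replace option. The text option is added here so the log can be read in any editor.

```stata
* Close log1 if it is still open from an earlier run; ignore the error if not
capture log close log1

* Overwrite the previous log file and write plain text instead of SMCL
log using log_auto, name(log1) replace text
sysuse auto, clear
summarize price
log close log1
```

With the text option and no extension given, the file is written as log_auto.log in the current working directory.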

There is a lot to say about macros, so they will get a separate post later.

1. Loops

1.1 while
while evaluates exp and, if it is true (nonzero), executes the stata commands enclosed in the braces. It then repeats the process until exp evaluates to false (zero).
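while loops for as long as an expression is true; forvalues and foreach, covered below, can be thought of as variants of while.

*Syntax
while exp {
    stata_commands
}

Here are simple examples of a single loop and a nested loop:

* Single loop
local j = 1
while `j' < 10 {
    display `j'
    local j = `j' + 1
}

* Nested loop
local i = 1
while `i' <= 5 {
    local j = 1
    while `j' < `i' {
        display "`j' is less than `i'"
        local j = `j' + 1
    }
    local i = `i' + 1
}

One more example: using while to solve the equation x^2 - 4x + 4 = 0 by stepping x upward until the left-hand side is close enough to zero.

local x_est = 0
while abs(`x_est' ^ 2 - 4 * `x_est' + 4 - 0) > 0.0000001 {
    local x_est = `x_est' + 0.001
}
display in red `x_est'

Output:
2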
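A while loop like the one above never stops if the tolerance is never reached, for example when the step size skips over the root. One possible safeguard, sketched here with a made-up iteration cap max_iter, is to count iterations and add the count to the loop condition:

```stata
* Same search for the root of x^2 - 4x + 4, with an iteration cap
local x_est = 0
local iter = 0
local max_iter = 10000        // illustrative cap, not from the original post

while abs(`x_est'^2 - 4*`x_est' + 4) > 0.0000001 & `iter' < `max_iter' {
    local x_est = `x_est' + 0.001
    local ++iter
}

display "stopped after `iter' iterations at x = " %6.3f `x_est'
```

If the loop ends because iter reached max_iter rather than because the tolerance was met, the result should not be trusted as a solution.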

1.2 forvalues
Loop over consecutive values
forvalues can only loop over numeric values.

*Syntax
forvalues lname = range {
    Stata commands referring to `lname'
}

The equation from the while example can also be solved with forvalues, provided we have guessed that the solution lies between 1 and 3:

forvalues x_est = 1(0.0001)3 {
    if abs(`x_est' ^ 2 - 4 * `x_est' + 4 - 0) < 0.000000001 {
        display in red `x_est'
        continue, break
    }
}

Output:
2

1.3 foreach
foreach repeatedly sets local macro lname to each element of the list and executes the commands enclosed in braces. The loop is executed zero or more times; it is executed zero times if the list is null or empty.
The list after foreach can be a macro, variable names, file names and so on, so foreach is more widely applicable than forvalues.

*Syntax
foreach lname {in | of listtype} list {
    commands referring to `lname'
}

Allowed are
    foreach lname in any_list {
    foreach lname of local lmacname {
    foreach lname of global gmacname {
    foreach lname of varlist varlist {
    foreach lname of newlist newvarlist {
    foreach lname of numlist numlist {

Suppose we want to display the values of make in auto.dta one at a time. The two approaches below are equivalent, but the of form is recommended.

* Using of
cls
sysuse auto, clear
levelsof make, local(make_info)
set rmsg on
foreach x of local make_info {
    display "`x'"
}
set rmsg off

* Using in
cls
sysuse auto, clear
levelsof make, local(make_info)
set rmsg on
foreach x in `make_info' {
    display "`x'"
}
set rmsg off

By combining foreach with other commands, we can let the computer do repetitive work for us.

* Create Excel workbooks named 2010 to 2018 in the given folder
cd C:\Users\Van\Desktop\test1
foreach file_name of numlist 2010/2018 {
    putexcel set `file_name'.xlsx, replace
    putexcel A1 = "Year"
    putexcel B1 = "Variable"
    putexcel C1 = "Value"
}
* The above is simpler and faster with forvalues

* Convert the Excel files created above to dta format
local xlsx_list: dir . files "*.xlsx"
foreach excel_file of local xlsx_list {
    display "`excel_file'"
    import excel using `excel_file', firstrow clear
    save `excel_file'.dta, replace
}

* Compute and display the means of several variables
cls
sysuse auto, clear
foreach v of varlist price mpg weight length {
    quietly summarize `v'
    display "mean of variable `v' is: " `r(mean)'
}
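The comment in the code above says the workbook loop is simpler with forvalues. A sketch of what that version might look like follows; it writes to the current working directory instead of the author's folder, and the macro name year is an arbitrary choice.

```stata
* forvalues version of the workbook loop: one file per year, 2010 to 2018
forvalues year = 2010/2018 {
    putexcel set `year'.xlsx, replace
    putexcel A1 = "Year"
    putexcel B1 = "Variable"
    putexcel C1 = "Value"
}
```

The range 2010/2018 steps by 1, so no numlist has to be built and expanded first.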

1.4 continue
The continue command within a foreach, forvalues, or while loop breaks execution of the current loop iteration and skips the remaining commands within the loop. Execution resumes at the top of the loop unless the break option is specified, in which case execution resumes with the command following the looping command.
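Sometimes a loop has to be cut short when a certain condition is met. That is what continue is for.

*Syntax
continue [, break]

  • continue: skips the remaining commands in the current iteration and goes on to the next iteration of the loop
  • continue, break: leaves the current loop altogether; execution resumes with the command that follows the loop

* continue
forvalues i = 1(1)5 {
    disp `i'
    if `i' > 2 {
        continue
    }
    disp "`i':Hello World"
}

Output:
1
1:Hello World
2
2:Hello World
3
4
5

* continue, break
forvalues i = 1(1)5 {
    disp `i'
    if `i' > 2 {
        continue, break
    }
    disp "`i':Hello World"
}

Output:
1
1:Hello World
2
2:Hello World
3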
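In nested loops, continue, break leaves only the loop it sits in; the enclosing loop keeps running. The sketch below illustrates this with two forvalues loops. The counter n_shown is a made-up helper that records how many lines were displayed.

```stata
* continue, break exits the inner loop only; the outer loop carries on
local n_shown = 0
forvalues i = 1/3 {
    forvalues j = 1/3 {
        if `j' > `i' {
            continue, break   // leave the j loop, move on to the next i
        }
        display "i = `i', j = `j'"
        local ++n_shown
    }
}
display "lines shown: `n_shown'"
```

For i = 1 the inner loop displays one line, for i = 2 two lines, and for i = 3 three lines, so six lines are shown in total.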

2. Conditional statements
The if command evaluates exp. If the result is true (nonzero), the commands inside the braces are executed. If the result is false (zero), those statements are ignored, and the statement (or statements if enclosed in braces) following the else is executed.
*Syntax
if exp {                  or    if exp single_command
    multiple_commands
}
else {                    or    else single_command
    multiple_commands
}

Suppose we want to write a small calculator program: when n > 0 the result is x^n; when n = 0 it is log(x); when n < 0 it is -x^n.

program power
    if `2' > 0 {
        display in red `1'^`2'
    }
    else if `2' == 0 {
        display in red log(`1')
    }
    else {
        display in red -(`1'^(`2'))
    }
end

power 16 2

Output:
256
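The positional macros `1' and `2' work, but named arguments are easier to read. The variant below is a sketch rather than the author's program: it is called power2 (a made-up name, to avoid clashing with power above), uses args to name the inputs, and drops any earlier definition first so the code can be rerun.

```stata
* Drop an earlier definition if there is one, so the block can be rerun
capture program drop power2

program define power2
    args x n                  // x is the base, n is the exponent
    if `n' > 0 {
        display `x'^`n'
    }
    else if `n' == 0 {
        display log(`x')
    }
    else {
        display -(`x'^(`n'))
    }
end

power2 16 2
```

Each else has to start on its own line after the closing brace, exactly as written here.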

3. Restoring data
Preserve and restore data
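Mistakes are unavoidable during data processing. They can overwrite the data in memory and force us to start again from scratch. Adding preserve while processing helps a great deal in this situation.

*Syntax
preserve [, changed]
restore [, not preserve]

Suppose we generate some random data made up of year and value, and want to extract each year's observations and save them separately. The obvious commands are:

keep if year == 2015
save 2015.dta, replace

After running these commands, however, the data in memory has changed: every observation outside 2015 has been dropped, so we would have to reload the data and repeat keep if for every year. preserve and restore solve this problem.

clear
set seed 12345
set obs 10
gen year = _n + 2009
expand 3
gen value = runiform()
save test2, replace

use test2, clear
forvalues i = 2010/2019 {
    preserve
    keep if year == `i'
    save `i', replace
    restore
}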
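The loop above needs the range of years to be known in advance. One way to avoid hard-coding it, sketched here by combining levelsof and foreach from earlier in this post, is to read the years from the data. The macro names years and y are arbitrary.

```stata
* Build a small example dataset: years 2010 to 2019, three rows per year
clear
set seed 12345
set obs 10
gen year = _n + 2009
expand 3
gen value = runiform()

* Loop over the years that actually occur in the data
levelsof year, local(years)
foreach y of local years {
    preserve
    keep if year == `y'
    save `y', replace         // writes 2010.dta, 2011.dta, ... to the working directory
    restore
}
```

After the loop finishes, the full dataset of 30 observations is still in memory, because each iteration ends with restore.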

4. Error handling
capture executes command, suppressing all its output (including error messages, if any) and issues a return code of zero. The actual return code generated by command is stored in the built-in scalar _rc.
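In everyday data work we often want a program to skip over an error automatically and keep going. Adding capture does exactly that.

*Syntax
capture [:] command
capture {
    stata_commands
}

Comparing the results of the following two pieces of code shows what capture does.

hist Hangzhou
dis "Hangzhou"

Output:
no variables defined

capture hist Hangzhou
dis "Hangzhou"

Output:
Hangzhou

Going one step further, we can add noisily so that the error message is still displayed:

capture noisily hist Hangzhou
dis "Hangzhou"

Output:
no variables defined
Hangzhou

display _rc shows the return code of the error:

capture noisily hist Hangzhou
dis "Hangzhou"
display _rc

Output of display _rc:
111

If the code runs normally, _rc is 0:

capture noisily display "Hangzhou"
display _rc

Output of display _rc:
0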

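capture can also wrap a block of commands:

capture noisily {
    di "Hangzhou"
    error
    di "Shanghai"
}

Output:
Hangzhou
invalid syntax
r(197);

Note that "Shanghai" is not displayed. As soon as an error occurs, capture jumps straight out of the braces and the block stops running.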
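Since the return code is kept in _rc, a program can also branch on it instead of only suppressing the error. A minimal sketch, using a deliberately nonexistent variable name no_such_var and a made-up macro rc:

```stata
sysuse auto, clear

* Check for the variable quietly and keep the return code for later use
capture confirm variable no_such_var
local rc = _rc

if `rc' != 0 {
    display "no_such_var not found (return code `rc'), skipping the plot"
}
else {
    histogram no_such_var
}
```

Copying _rc into a local right away matters, because any later capture overwrites it.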
