How can I run R's str() function on all of the files loaded in the workspace at once? I simply want to export this information to a .csv file in a batch-like process. I have over 100 files, and I want to compare one workspace with another to help locate incongruities in data structure and avoid mismatches.

I came painfully close to a solution via UCLA's R Code Fragment; however, it does not include instructions for forming the read.dta call that loops through the files. That is the part I need help with.

What I have so far:

#Define the file path
f <- file.path("C:/User/Datastore/RData")
#List the files in the path
fn <- list.files(f)
#loop through file list, return str() of each .RData file
#write to .csv file with 4 columns (file name, length, size, value)
Here is an example of what I am after (the view from RStudio, which simply lists the Name, Type, Length, Size, and Value of all of the RData files). I basically want to replicate this view, but export it to a .csv. I am adding the RStudio tag in case someone knows a way to export this table automatically; I couldn't find a way to do it.

Thanks in advance.

Could you please clarify what information you want about the files? str returns information about the R object, whereas your desired output looks like information about the raw file (e.g. size), which you would get with file.info. You should provide a small example of the desired output. – cdeterman Oct 15, 2014 at 22:30

If I do str(filename), I should get something like num [1:409(1d)] 0 0 0 0 0 0 0 0 0 0 ... This is enough for me to know that filename is a 1:409 1-dimensional array. That's all I need. However, with that said, what I really want is something like what is listed in RStudio's Global Environment pane (grid view), if you are familiar with that tool. I attempted a screen capture but SO wouldn't let me post it. – myClone Oct 15, 2014 at 23:24
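To make the distinction in these comments concrete, here is a minimal sketch contrasting the two calls. The object `x` and the temporary file are made up for the demo; only `file.info()`, `str()` and `capture.output()` come from base R and utils.

```r
# Illustrative object and throwaway file (both are assumptions for the demo)
x <- array(0, dim = 409)
path <- tempfile(fileext = ".RData")
save(x, file = path)

# file.info() describes the file on disk: size in bytes, timestamps, etc.
disk_info <- file.info(path)
disk_info$size

# str() describes the R object. It prints its description and returns NULL
# invisibly, so capture.output() is needed to keep the text.
obj_desc <- capture.output(str(x))
obj_desc
```

The captured text is what a line such as foo_str <- str(foo) would not give you, since that assignment only stores NULL.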

I've actually written a function for this already. I also asked a question about it and about handling promise objects with the function. That post might be of some use to you.

The issue with the last column is that str is not meant to do anything but print a compact description of objects and therefore I couldn't use it (but that's been changed with recent edits). This updated function gives a description for the values similar to that of the RStudio table. The data frames and lists are tricky because their str output is more than one line. This should be good.

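objInfo <- function(env = globalenv()) {
    # Fetch every object in the environment as a named list
    obj <- mget(ls(env), env)
    out <- lapply(obj, function(x) {
        vals1 <- c(
            Type = toString(class(x)),
            Length = length(x),
            Size = object.size(x)
        )
        # Keep only the first line of the str() output, minus leading
        # whitespace and the data frame prefix
        val2 <- gsub("|^\\s+|'data.frame':\t", "", capture.output(str(x))[1])
        if(grepl("environment", val2)) val2 <- "Environment"
        c(vals1, Value = val2)
    })
    # Stack the per-object vectors into one character matrix
    out <- do.call(rbind, out)
    rownames(out) <- seq_len(nrow(out))
    noquote(cbind(Name = names(obj), out))
}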

And then we can test it out on a few objects:

x <- 1:10
y <- letters[1:5]
e <- globalenv()
df <- data.frame(x = 1, y = "a")
m <- matrix(1:6)
l <- as.list(1:5)
objInfo()
#   Name    Type        Length Size  Value                          
# 1 df      data.frame  2      1208  1 obs. of  2 variables         
# 2 e       environment 11     56    Environment     
# 3 l       list        5      328   List of 5                      
# 4 m       matrix      6      232   int [1:6, 1] 1 2 3 4 5 6       
# 5 objInfo function    1      24408 function (env = globalenv())   
# 6 x       integer     10     88    int [1:10] 1 2 3 4 5 6 7 8 9 10
# 7 y       character   5      328   chr [1:5] a b c d e  

Which is pretty close I guess. Here's the screen shot of the environment in RStudio.

Thanks Richard...this is what I was after. Actually, it's more elaborate than I needed. I really don't need size. But the objectinfo, type, and length are crucial. In reading your posts, I realized that simply typing in ls.str() within RStudio printed out the information I wanted! – myClone Oct 16, 2014 at 0:13

Never hurts to have a new tool handy. I actually see some functionality in your code that I can borrow for something else I need, so many thanks! – myClone Oct 16, 2014 at 0:15

Just a note, you might want to add a data frame at the end, as I just used a noquote matrix since it was for personal use. – Rich Scriven Oct 16, 2014 at 0:17
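The last comment suggests returning a data frame rather than a noquote matrix. A minimal sketch of that conversion, followed by the .csv export the question asks for, might look like the following. The matrix `info` is a hand-built stand-in with illustrative values, not real objInfo() output, and the names `info_df` and `csv_path` are made up here.

```r
# Stand-in for a noquote character matrix such as objInfo() returns
# (values are illustrative only)
info <- noquote(cbind(
  Name   = c("x", "y"),
  Type   = c("integer", "character"),
  Length = c("10", "5")
))

# Drop the noquote class, then convert the character matrix to a data frame
info_df <- as.data.frame(unclass(info), stringsAsFactors = FALSE)
info_df$Length <- as.integer(info_df$Length)

# Export; tempfile() only keeps the demo self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(info_df, csv_path, row.names = FALSE)
```

With real output, the same two steps would be applied to the matrix returned by the function, and `csv_path` would be replaced by the desired output file.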

I would write a function, something like the one below, and then loop over the files with it, so that you basically only write the code for a single dataset.

library(foreign)
giveSingleDataset <- function( oneFile ) {
  #Read .dta file
  df <- read.dta( oneFile )
  #Give e.g. structure
  s <- ls.str(df)
  #Return what you want
  return(s)
}
#Actually call the function (fn must hold paths that read.dta can open,
#e.g. from list.files(f, full.names = TRUE))
result <- lapply( fn, giveSingleDataset )
I don't think that would work the way you'd want it to given the way str prints its output – Rich Scriven Oct 15, 2014 at 22:48

Please see my comment in response to @cdeterman. Your function gives the structure of the file itself, not the contents of the data file. What I was looking for was a way of reading in all of the RData files and obtaining their structure, something like foo_str <- str(foo) would return num [1:409(1d)] 0 0 0 0 0 0 0 0 0 0 ... – myClone Oct 15, 2014 at 23:37
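For the .RData case described in this last comment, one possible approach is to load each file into its own environment and describe whatever it contains. This is a sketch, not code from either answer: `describe_rdata`, `demo_dir`, `foo` and `bar` are hypothetical names, and the demo files stand in for the real directory.

```r
# Hypothetical helper: one row per object stored in a single .RData file
describe_rdata <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)  # load() returns the restored names
  rows <- lapply(obj_names, function(nm) {
    x <- get(nm, envir = env)
    data.frame(
      File   = basename(path),
      Name   = nm,
      Type   = toString(class(x)),
      Length = length(x),
      # First printed line of str(), with leading whitespace trimmed
      Value  = trimws(capture.output(str(x))[1]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Demo with throwaway files; point `demo_dir` at the real folder instead
demo_dir <- tempfile("rdata_demo")
dir.create(demo_dir)
foo <- array(0, dim = 409)
bar <- letters[1:5]
save(foo, file = file.path(demo_dir, "foo.RData"))
save(bar, file = file.path(demo_dir, "bar.RData"))

# full.names = TRUE keeps the directory in each path so load() can find it
files <- list.files(demo_dir, pattern = "\\.RData$", full.names = TRUE)
structure_table <- do.call(rbind, lapply(files, describe_rdata))
write.csv(structure_table, file.path(demo_dir, "structure.csv"),
          row.names = FALSE)
```

Loading into a fresh environment per file keeps the global workspace untouched and avoids name clashes between files. Two tables built this way (one per workspace) could then be lined up with merge() on File and Name to look for mismatches.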
        
